Re: [PATCH v2 2/7] lib: add a date/time parser module

Subject: Re: [PATCH v2 2/7] lib: add a date/time parser module

Date: Sun, 05 Aug 2012 10:08:58 -0300

To: Jani Nikula, notmuch@notmuchmail.org

Cc:

From: David Bremner


Jani Nikula <jani@nikula.org> writes:

> +
> +static enum field
> +abs_to_rel_field (enum field field)
> +{
> +    assert (field <= TM_ABS_YEAR);
> +
> +    /* note: depends on the enum ordering */
> +    return field + (TM_REL_SEC - TM_ABS_SEC);
> +}
> +

I wonder if this would be slightly nicer of you defined a TM_FIRST_REL
or so as a synonym like TM_NONE and TM_SIZE

> +/* get zero value for field */
> +static int
> +field_zero (enum field field)
> +{
> +    if (field == TM_ABS_MDAY || field == TM_ABS_MON)
> +	return 1;
> +    else if (field == TM_ABS_YEAR)
> +	return 1970;
> +    else
> +	return 0;
> +}

what do you think about using the word "epoch" instead of zero here?

> +static bool
> +get_postponed_number (struct state *state, int *v, int *n, char *d)
> +{

I found the 1 letter names not quite obvious here.

At this point reading the code, I have not trouble understanding each
line/function, but I feel like I'm missing the big picture a bit. 
What is a postponed number?

> +    /*
> +     * REVISIT: There could be a "next_field" that would be set from
> +     * "field" for the duration of the handle_postponed_number() call,
> +     * so it has more information to work with.
> +     */

The notmuch convention seems to be to use XXX: for this. I'm not sure
I'd bother changing, especially if we can't decide how to package this.

> +/* Time set helper. No input checking. Use UNSET (-1) to leave unset. */
> +static int
> +set_abs_time (struct state *state, int hour, int min, int sec)
> +{
> +    int r;
> +
> +    if (hour != UNSET) {
> +	if ((r = set_field (state, TM_ABS_HOUR, hour)))
> +	    return r;
> +    }

So for this function and the next, the first match wins? I don't really
see the motivation for this, maybe you can explain a bit.


> +    /* timezone codes: offset in minutes. FIXME: add more codes. */

Did you think about trying to delegate the list of timezones to the
system?

> + * Compare strings s and keyword. Return number of matching chars on
> + * match, 0 for no match. Match must be at least n chars (n == 0 all
> + * of keyword), otherwise it's not a match. Use match_case for case
> + * sensitive matching.
> + */

I guess that's fine, and it is internal, but maybe -1 for whole string
would be slightly nicer (although I can't imagine what good matching 0
length strings is at the moment).

> +	/* Minimum match length. */
> +	p = strchr (keyword, '|');
> +	if (p) {
> +	    minlen = p - keyword;
> +	    memmove (p, p + 1, strlen (p + 1) + 1);
> +	}

Something about that memmove creeps me out, but I trust you that it's
correct. Alternatively I guess you could represent keywords as pairs of
strings, which is probably more of a pain.


> +
> +/* Parse a single number. Typically postpone parsing until later. */

OK, so I finally start to understand what a postponed number is :)
I understand the compiler likes bottom up declarations, but some
top down declarations/comments are needed I think.

> +static int
> +parse_date (struct state *state, char sep,
> +	    unsigned long v1, unsigned long v2, unsigned long v3,
> +	    size_t n1, size_t n2, size_t n3)
> +{
> +    int year = UNSET, mon = UNSET, mday = UNSET;
> +
> +    assert (is_date_sep (sep));
> +
> +    switch (sep) {
> +    case '/': /* Date: M[M]/D[D][/YY[YY]] or M[M]/YYYY */

If I understand correctly, this chooses between American (?) month, day,
year ordering and "sensible" day, month, year ordering by delimiter. I
never thought about this as a way to tell (I often write D/M/Y), but
that could be just me. I agree it's fine as a convention.

> +/*
> + * Parse delimiter(s). Return < 0 on error, number of parsed chars on
> + * success.
> + */

So 1:-2 will parse as 1-2 ?, i.e. last delimiter wins? Maybe better to
say so explicitly.

> +/* Combine absolute and relative fields, and round. */
> +static int
> +create_output (struct state *state, time_t *t_out, const time_t *tnow,
> +	       int round)
> +{

It seems like most of non-obvious logic like (when is "wednesday") is
encoded here. From a maintenence point of view, it would be nice to be
able to seperate out the heuristic stuff from the mechanical, to the
degree that it is possible.

d

Thread: