Re: [RFC2 Patch 5/5] lib: iterator API for message properties

Subject: Re: [RFC2 Patch 5/5] lib: iterator API for message properties

Date: Fri, 03 Jun 2016 20:12:52 -0300

To: Daniel Kahn Gillmor, notmuch@notmuchmail.org

Cc:

From: David Bremner


Daniel Kahn Gillmor <dkg@fifthhorseman.net> writes:

> On Fri 2016-06-03 08:54:00 -0400, David Bremner wrote:
>> Sure, where do you think that kind of documentation is appropriate?
>> There is the giant comment about the database schema in
>> lib/database.cc. Actually I just noticed I already failed to update that
>> for libconfig stuff.
>
> That comment seems OK, but it won't be exposed to the people who are in
> that middle range (python or ruby programmers but not C programmers).
> Do we have a place for this kind of mid-level documentation?

The simplest solution is probably the API documentation itself
(lib/notmuch.h), which should propagate to the bindings documentation.
Maybe I'll start with that, and we can go from there.

>
>> [ dkg wrote: ]
>>>  * for messages which have multiple files, which file is actually indexed
>>
>> yes. Although rather than storing that, I think the right answer is more
>> like "all of them".
>
> I don't think we do this, do we?  Is this a bug?  is it tracked somewhere?

IMHO it is a bug. It's implicit in

   id:87k42vrqve.fsf@pip.fifthhorseman.net

and the various requests for List-Id indexing, but it's probably worth
starting a separate thread to track it, especially since there are some
unresolved design issues, like what to return for searches.

> This is exactly my point -- i don't care about reproducibility of the
> exact numbering, but the thread-id is *not* reproducible from the
> message sets.  This is not only because of the ghost message leakage bug
> documented in T590-thread-breakage.sh, but also because threads can be
> joined by a message that is later removed (e.g., the "notmuch-join"
> script in id:87egabu5ta.fsf@alice.fifthhorseman.net ).

I see, I guess that's the intended behaviour given 604d1e0977c.

I haven't thought about the pros and cons of dumping/restoring
thread-ids. At least my database has about half as many threads as
messages, so it's a bit of data, but perhaps that's not a big problem.
It's somewhat orthogonal to this series since those terms are already
attached to messages.

>> I'm not sure what you have in mind, something more ambitious than the
>> header added post 0.22?
>
> Can you point me to the definition for that header?  i still don't
> understand what the batch-tag:2 part means.  (sorry i haven't been
> keeping up with the master branch lately!)
>

Currently there's just the source: the header says which format, which
version of that format (the "2" in batch-tag:2), and which subset of the
output is included.

static void
print_dump_header (gzFile output, int output_format, int include)
{
    gzprintf (output, "#notmuch-dump %s:%d %s%s%s\n",
	      (output_format == DUMP_FORMAT_SUP) ? "sup" : "batch-tag",
	      NOTMUCH_DUMP_VERSION,
	      (include & DUMP_INCLUDE_CONFIG) ? "config" : "",
	      (include & DUMP_INCLUDE_TAGS) && (include & DUMP_INCLUDE_CONFIG) ? "," : "",
	      (include & DUMP_INCLUDE_TAGS) ? "tags" : "");
}
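
To make the resulting header lines concrete, here is a rough standalone
sketch, not the actual notmuch code. It reuses the format string from
print_dump_header above but writes into a buffer with snprintf instead of
a gzFile. The helper name format_dump_header is hypothetical, and the
constant values are assumptions chosen for illustration (the version is
set to 2 only to match the "batch-tag:2" mentioned above); the real
definitions live in the notmuch sources and may differ.

```c
#include <stdio.h>

/* Assumed values for illustration only; see the notmuch sources for
 * the real definitions. */
#define NOTMUCH_DUMP_VERSION 2
enum { DUMP_FORMAT_BATCH_TAG, DUMP_FORMAT_SUP };
enum { DUMP_INCLUDE_TAGS = 1, DUMP_INCLUDE_CONFIG = 2 };

/* Hypothetical helper: same format string as print_dump_header, but
 * the header line is written into buf rather than a gzFile.
 * Returns whatever snprintf returns. */
static int
format_dump_header (char *buf, size_t len, int output_format, int include)
{
    return snprintf (buf, len, "#notmuch-dump %s:%d %s%s%s\n",
		     /* format name */
		     (output_format == DUMP_FORMAT_SUP) ? "sup" : "batch-tag",
		     /* format version, printed after the colon */
		     NOTMUCH_DUMP_VERSION,
		     /* comma-separated list of included subsets */
		     (include & DUMP_INCLUDE_CONFIG) ? "config" : "",
		     (include & DUMP_INCLUDE_TAGS) && (include & DUMP_INCLUDE_CONFIG) ? "," : "",
		     (include & DUMP_INCLUDE_TAGS) ? "tags" : "");
}
```

Under those assumptions, calling it with DUMP_FORMAT_BATCH_TAG and both
include bits set gives "#notmuch-dump batch-tag:2 config,tags", and with
only DUMP_INCLUDE_TAGS set it gives "#notmuch-dump batch-tag:2 tags".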

