Re: [PATCH 6/8] cli: add support for batch tagging operations to "notmuch tag"

Subject: Re: [PATCH 6/8] cli: add support for batch tagging operations to "notmuch tag"

Date: Wed, 04 Apr 2012 07:55:24 -0300

To: Jameson Graef Rollins, Jani Nikula

Cc: Notmuch Mail

From: David Bremner


Jameson Graef Rollins <jrollins@finestructure.net> writes:

> With that in mind, I think I stand by my suggestion that the form should
> match exactly the notmuch subcommand format.  Even considering the
> technical issues that Jani brought up, I still think it makes the most
> sense to imagine generic batch processing handled by the top level
> binary.  And in that case the most logical format for the input is
> probably just that of the CLI arguments.

One thing that worries me about this (and to be honest it worries me a
bit about the single-character command tag) is the potential increase in
size of a dump file, if we use exactly a list of commands as a dump
format. The SQL/XML-like argument that it will all compress well is
true; nonetheless, for applications involving version control, it does
seem useful to have an uncompressed version around.

A very rough estimate suggests that, for my roughly 250k messages,
prepending "tag " to each line bloats a dump file by about 5%. Maybe
that is not worth worrying about. I'd be curious to see how 4 * #lines /
(total dump size) works out for other people.  I thought that the bloat
from having + in front of every tag would be larger, but it seems that
my messages average something like one tag per message (many messages
with no tags). I'm not sure how universal that is.
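
For anyone who wants to try the 4 * #lines / (total dump size) estimate
on their own data, here is a minimal sketch in Python. The function name
prefix_overhead and its parameters are made up for illustration; it only
assumes the dump is line-oriented, with one message per line, and it
measures sizes in bytes.

```python
def prefix_overhead(dump_bytes: bytes, prefix: bytes = b"tag ") -> float:
    """Fractional size increase from prepending `prefix` to every line.

    This is len(prefix) * #lines / (total dump size), i.e. the
    4 * #lines / size estimate when the prefix is b"tag ".
    """
    if not dump_bytes:
        return 0.0
    # Count newline-terminated lines, plus a final unterminated line if any.
    n_lines = dump_bytes.count(b"\n")
    if not dump_bytes.endswith(b"\n"):
        n_lines += 1
    return len(prefix) * n_lines / len(dump_bytes)


def overhead_for_file(path: str, prefix: bytes = b"tag ") -> float:
    """Convenience wrapper: read a saved dump file and report its overhead."""
    with open(path, "rb") as fh:
        return prefix_overhead(fh.read(), prefix)
```

Saving the output of "notmuch dump" to a file and passing its path to
overhead_for_file should give a number comparable to my 5%. Passing
prefix=b"+" gives only a lower bound for the "+" bloat, since that
prefix would be added per tag rather than per line.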

We could also give up on marking the command on each line, and insert
some kind of simple header at the top. This idea came up in the context
of restore formats before.

> Just out of curiosity and for the sake of argument, if we were going to
> design a server/batch processor from the ground up would it make sense
> to use a format like this, or would we better off opting for some other
> more established protocol?

I guess it depends how much work it is to support the established
protocol, and how good the fit is with notmuch.  Are there candidates
other than IMAP? 

As far as implementation effort goes, as a totally unscientific
experiment I grabbed Net::IMAP::Server from CPAN; it is almost 7000
lines of Perl. I'm not suggesting we use Perl ;), but I doubt a C
implementation would be shorter. Hopefully we wouldn't write such a
library from scratch.  A quick search did not lead me to "the canonical
IMAP server library", unless that is the UW one, of which I have bad,
if non-specific, memories.

I think we'd need to use a fair number of extensions to basic IMAP. What
might work well is the GMail extensions to IMAP. I have no idea about
the difficulty of implementing those; I suspect there are not solid C
libraries supporting them.

d
