Re: [RFC PATCH 00/14] modular mail stores based on URIs

Subject: Re: [RFC PATCH 00/14] modular mail stores based on URIs

Date: Sun, 1 Jul 2012 12:02:08 -0400

To: Mark Walters

Cc: notmuch@notmuchmail.org

From: Ethan


Thanks for going through it; I know there's a lot to go through.

On Thu, Jun 28, 2012 at 4:45 PM, Mark Walters <markwalters1009@gmail.com> wrote:

> I was thinking of just having one mail root and inside that there could
> be maildirs and mboxes. Everything would still be relative to the root.
>

I'm hesitant to have directories that contain maildirs and mboxes. It
should be possible to unambiguously distinguish between a maildir file and
an mbox file (mboxes always start with "From ", no colon) but it sounds
kind of fragile.
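
A minimal sketch of the check described above, assuming that looking at the first five bytes is enough. The function name is made up for illustration and is not part of the patch series. The fragile part is visible here: any file that merely happens to begin with "From " would be classified as an mbox.

```python
def looks_like_mbox(path):
    """Guess whether a file is an mbox by its first five bytes.

    An mbox starts with a "From " separator line (space, no colon),
    while a plain message file normally starts with a header field
    such as "From:" or "Return-Path:".
    """
    with open(path, "rb") as f:
        return f.read(5) == b"From "
```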

> >  1. Are URIs the way to specify individual messages, despite bremner's
> >  concerns about too much of the API being strings? Is adding another
> >  library the easiest way to parse URIs?
>
> In my opinion the nice thing about using strings is that it does not
> require any changes to the Xapian database to store them. I think using
> URIs may not be best though as they seem to be annoying to parse (as
> filenames can contain the same characters) and you seem to need to work
> around the parser in some cases.
>

I think that's more the fault of the parser than of the URIs. If glib came
with a parser, that would be great. There aren't a lot of options for
pure-C URI parsing. Besides uriparser, there's also some code in the W3C
sample code library, but it looked like integrating it would be a pain so I
let it go.
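
To illustrate the "filenames can contain the same characters" problem without any particular C library, here is a rough sketch using Python's standard urllib.parse. The helper names and the example path are made up for illustration. A raw '#' in a filename is read as the start of a fragment unless the path is percent-encoded first.

```python
from urllib.parse import quote, unquote, urlsplit


def path_to_uri(scheme, path):
    """Build a URI from a filesystem path, percent-encoding reserved characters."""
    return f"{scheme}://{quote(path)}"


def uri_to_path(uri):
    """Recover the filesystem path from a URI built by path_to_uri()."""
    return unquote(urlsplit(uri).path)


# Pasting the raw filename into the URI loses everything after '#':
naive = urlsplit("mbox:///home/me/mail/inbox#old")
print(naive.path, naive.fragment)

# Encoding first keeps the filename intact across a round trip:
uri = path_to_uri("mbox", "/home/me/mail/inbox#old")
print(uri, uri_to_path(uri))
```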

> I wonder if the following would be practical: use // as the field
> separator:
>
> e.g. mbox://filename//start_of_message+length
>
> I think 2 consecutive slashes // is about the only thing we can assume
> is not in the path or filename. Since it is not in the filename I think
> parsing should be trivial (thus avoiding the extra library).
>

Can you explain what you mean when you say that two consecutive slashes
can't appear in a URL? Ordinary filesystem paths can contain them, and so
can file: URLs. (I just looked up file:///home/ethan///////tmp and Firefox
handled that OK.) I've sometimes seen machine-generated filenames with
double slashes because that way you don't have to make sure the incoming
filename was correctly terminated before adding another level.
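
A small sketch of why this matters for the proposed mbox://filename//start_of_message+length form. The parser below is hypothetical and is not what the patch series does. It assumes the offset field never contains a slash and therefore splits on the last "//". Splitting on every "//" gives the wrong number of fields as soon as the path itself contains one.

```python
def parse_mbox_ref(ref):
    """Parse 'mbox://<filename>//<start>+<length>' into (filename, start, length).

    Splits on the LAST '//' so that double slashes inside the
    filename do not confuse the field separator.
    """
    prefix = "mbox://"
    if not ref.startswith(prefix):
        raise ValueError("not an mbox reference: %r" % ref)
    filename, sep, span = ref[len(prefix):].rpartition("//")
    if not sep:
        raise ValueError("missing '//' field separator: %r" % ref)
    start, plus, length = span.partition("+")
    if not plus:
        raise ValueError("missing '+' in offset field: %r" % ref)
    return filename, int(start), int(length)


ref = "mbox:///home/ethan///tmp/inbox//1024+512"

# Naive approach: splitting on every '//' yields three fields, not two.
naive_fields = ref[len("mbox://"):].split("//")
print(naive_fields)

# Splitting on the last '//' keeps the odd-looking path in one piece.
print(parse_mbox_ref(ref))
```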


> Secondly, I would prefer to keep maildirs as just the bare file name: so
> the existence of // can be the signal that there is some other
> scheme. This is asymmetric, but is rather more backwardly compatible.
>

Based on your and Jani's reasoning, I did this. Revised patch series
follows.

Ethan
