Re: [PATCH] Add configurable changed tag to messages that have been changed on disk

Subject: Re: [PATCH] Add configurable changed tag to messages that have been changed on disk

Date: Wed, 23 Apr 2014 17:28:43 -0400

To: David Mazieres expires 2014-07-05 CEST

Cc: notmuch@notmuchmail.org

From: Austin Clements


Quoth David Mazieres on Apr 06 at 10:19 pm:
> Gaute Hope <eg@gaute.vetsj.com> writes:
> 
> > When one of the source files for a message is changed on disk, renamed,
> > or deleted, or a new source file is added, a configurable changed tag
> > is added. The tag can be configured under the option 'changed_tags' in
> > the [new] section; the default is none. Tests have been updated to
> > accept the new config option.
> >
> > notmuch-setup now asks for a changed tag after the new tags question.
> >
> > This could be useful, for example, for 'afew' to detect remote changes in
> > IMAP folders and update the FolderNameFilter to also add or remove
> > tags when an _existing_ message has been added to or removed from a
> > maildir.
> 
> I think this is the wrong way to achieve such functionality, because
> then the change tag A) is expensive to remove, B) is easy to misuse
> (remember to call fsync everywhere before deleting the change tag), and
> C) can be used by only one application.
> 
> A better approach would be to add a new "modtime" xapian value that is
> updated whenever the tags or any other terms (such as XFDIRENTRY) are
> added to or deleted from a docid.  If it's a Xapian value, rather than a
> term, then modtime will be queryable just like date, allowing multiple
> applications to query all docids modified since the last time they ran.

I'd like to have efficient change detection, too.  In my case, I'd
like to use it to support efficient live search and show updates.  The
design I'd sketched out for that used a log rather than ctimes, and
I'm curious if you have thoughts on the relative merits and
suitability for tag sync of these approaches.

I'd been leaning toward logging because it can capture changes to
things that aren't represented as documents in the database, such as
thread membership.  This probably doesn't matter for synchronization,
but it makes it much easier to figure out which threads in thread
search results need to be refreshed.  A log can also capture message
deletion easily, while ctimes would require tombstones (which may be a
good idea for other reasons [1]).

On the other hand, logging is obviously more mechanism than ctimes,
and probably requires some garbage collection story.
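
As a rough illustration of the trade-off described above, here is a
minimal in-memory sketch of the two approaches.  It is not notmuch or
Xapian code: the class names, method names, and integer "stamps" are
all hypothetical, chosen only to show where tombstones and garbage
collection would come in.

```python
class CtimeStore:
    """Per-message change stamps (hypothetical stand-in for a ctime value)."""

    def __init__(self):
        self.clock = 0
        self.stamps = {}      # message id -> stamp of last change
        self.tombstones = {}  # deleted message id -> stamp of deletion

    def _tick(self):
        self.clock += 1
        return self.clock

    def touch(self, msg_id):
        # Any change to a message (tags, filenames, ...) bumps its stamp.
        self.tombstones.pop(msg_id, None)
        self.stamps[msg_id] = self._tick()

    def delete(self, msg_id):
        # Without a tombstone, a reader could never learn of the deletion.
        self.stamps.pop(msg_id, None)
        self.tombstones[msg_id] = self._tick()

    def changed_since(self, stamp):
        changed = {m for m, s in self.stamps.items() if s > stamp}
        deleted = {m for m, s in self.tombstones.items() if s > stamp}
        return changed, deleted


class ChangeLog:
    """Append-only log; entries may name things that are not messages."""

    def __init__(self):
        self.next_seq = 1
        self.entries = []  # (sequence number, kind, subject)

    def append(self, kind, subject):
        self.entries.append((self.next_seq, kind, subject))
        self.next_seq += 1

    def since(self, seq):
        return [e for e in self.entries if e[0] > seq]

    def collect(self, oldest_needed):
        # Garbage collection: drop entries every reader has consumed.
        self.entries = [e for e in self.entries if e[0] > oldest_needed]


store = CtimeStore()
store.touch("msg-a")
store.touch("msg-b")
store.delete("msg-a")
changed, deleted = store.changed_since(1)

log = ChangeLog()
log.append("tag", "msg-a")
log.append("thread", "thread-1")  # thread membership is not a message
log.append("delete", "msg-a")
recent = log.since(1)
```

In this sketch the log records the thread change and the deletion
directly, while the stamp-based store only sees the deletion because of
its tombstone table.  The price is the collect() step, which needs some
way of knowing how far every reader has caught up.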

[1] id:20140421162058.GE25817@mit.edu
