Re: query on a subset of messages ?

Subject: Re: query on a subset of messages ?

Date: Mon, 9 Jul 2012 12:30:00 -0400

To: Sebastien Binet

Cc: Notmuch developer list

From: Austin Clements


Quoth Sebastien Binet on Jul 09 at 10:25 am:
> 
> hi there,
> 
> I was trying to reduce the I/O stress during my usual email
> fetching+tagging by writing a little program using the go bindings to
> notmuch.
> 
> ie:
> db, status := notmuch.OpenDatabase(db_path,
>     		notmuch.DATABASE_MODE_READ_WRITE)
> query := db.CreateQuery("(tag:new AND tag:inbox)")
> msgs := query.SearchMessages()
> for _,msg := range msgs {
>   tag_msg(msg, tagqueries)
> }
> 
> 
> where tagqueries is a subquery of the form:
> [
>     {
>         "Cmd": "+to-me",
>         "Query": "(to:sebastien.binet@cern.ch and not tag:to-me)"
>     },
>     {
>         "Cmd": "+sci-notmuch",
>         "Query": "from:notmuch@notmuchmail.org or to:notmuch@notmuchmail.org or subject:notmuch"
>     }
> ]
> 
> 
> the idea being that I need to crawl through the db only once and
> then iteratively apply tags on those messages (instead of repeatedly
> running "notmuch tag ..." for each and every one of those many
> 'tag-queries')
> 
> I couldn't find any C-API to do such a thing using the notmuch library.
> did I overlook something ?
> 
> Is it something useful to add ?
> 
> -s

Have you tried a more direct translation of the multiple notmuch tag
commands into Go, without trying to restrict each query to a subset
of messages?  Unless you're tagging a huge number of messages, the
cost of notmuch tag is almost certainly dominated by the fsync that it
does when it closes the database (which every invocation of notmuch
tag must do).  In Go, however, you can keep the database open across
all of the tagging operations and then close and fsync it just once.
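
A rough sketch of that structure is below.  To keep it runnable
without the notmuch bindings, it is written against a hypothetical
tagStore interface with an in-memory fake (memStore); neither is the
API of the Go bindings, and the canned queries are made up.  The only
point is the shape: open once, run every tag query, close once.  The
Cmd/Query fields follow the rule format from your message.

```go
package main

import "fmt"

// TagQuery mirrors one entry of the rule list quoted above.
type TagQuery struct {
	Cmd   string // "+tag" to add a tag, "-tag" to remove it
	Query string // search terms selecting the messages to tag
}

// tagStore is a hypothetical stand-in for an open read-write database.
// It is NOT the API of the notmuch Go bindings.
type tagStore interface {
	Search(query string) []string // IDs of matching messages
	AddTag(id, tag string)
	RemoveTag(id, tag string)
	Close() // flushes to disk; this is the expensive step
}

// applyAll runs every tag query against one open store and closes the
// store exactly once, after all rules have been applied.
func applyAll(db tagStore, rules []TagQuery) {
	defer db.Close()
	for _, r := range rules {
		if len(r.Cmd) < 2 {
			continue // malformed command, skip it
		}
		op, tag := r.Cmd[0], r.Cmd[1:]
		for _, id := range db.Search(r.Query) {
			switch op {
			case '+':
				db.AddTag(id, tag)
			case '-':
				db.RemoveTag(id, tag)
			}
		}
	}
}

// memStore is an in-memory fake that only exists to make the sketch run.
type memStore struct {
	results map[string][]string        // canned query results
	tags    map[string]map[string]bool // message ID -> set of tags
	closes  int                        // number of Close calls
}

func (m *memStore) Search(q string) []string { return m.results[q] }

func (m *memStore) AddTag(id, tag string) {
	if m.tags[id] == nil {
		m.tags[id] = map[string]bool{}
	}
	m.tags[id][tag] = true
}

func (m *memStore) RemoveTag(id, tag string) { delete(m.tags[id], tag) }

func (m *memStore) Close() { m.closes++ }

func main() {
	db := &memStore{
		results: map[string][]string{
			"to:me":           {"id1"},
			"subject:notmuch": {"id1", "id2"},
		},
		tags: map[string]map[string]bool{},
	}
	applyAll(db, []TagQuery{
		{Cmd: "+to-me", Query: "to:me"},
		{Cmd: "+sci-notmuch", Query: "subject:notmuch"},
	})
	fmt.Println(len(db.tags["id1"]), len(db.tags["id2"]), db.closes)
}
```

With the real bindings, the open and close calls would bracket the
loop in the same way, so the fsync cost is paid once per run rather
than once per tag query.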

Note that there is an important optimization in notmuch tag that you
might have to replicate.  It manipulates the original query to exclude
messages that already have the desired tags, so that they get skipped
very efficiently at the earliest stage possible.
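
Something along these lines would approximate that rewrite.  This is
a sketch, not a copy of what notmuch tag does internally: narrowQuery
and quoteTag are names I made up, and the quoting rule (wrap the tag
in double quotes, double any embedded quote) and the special case for
"*" are my assumptions about what the query parser expects.  A message
is only worth visiting if it lacks a tag being added or still has a
tag being removed.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteTag wraps a tag in double quotes and doubles any embedded quote.
// Assumption: this is how the query parser expects quoted terms.
func quoteTag(tag string) string {
	return `"` + strings.ReplaceAll(tag, `"`, `""`) + `"`
}

// narrowQuery restricts query to messages that a tagging operation
// would actually change: those missing one of the tags in add, or
// still carrying one of the tags in remove.
func narrowQuery(query string, add, remove []string) string {
	var terms []string
	for _, t := range add {
		terms = append(terms, "not tag:"+quoteTag(t))
	}
	for _, t := range remove {
		terms = append(terms, "tag:"+quoteTag(t))
	}
	if len(terms) == 0 {
		return query // nothing to add or remove, leave the query alone
	}
	filter := "(" + strings.Join(terms, " or ") + ")"
	// Assumption: "*" only means "all messages" when it is the whole
	// query, so it cannot be combined with other terms.
	if strings.TrimSpace(query) == "*" {
		return filter
	}
	return "( " + query + " ) and " + filter
}

func main() {
	fmt.Println(narrowQuery("to:sebastien.binet@cern.ch", []string{"to-me"}, nil))
}
```

Your first rule already does this by hand with "and not tag:to-me";
the second one does not, so it would revisit every matching message
on each run.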
