[PATCH 0/5] notmuch batch count

Subject: [PATCH 0/5] notmuch batch count

Date: Tue, 15 Jan 2013 23:54:42 +0200

To: notmuch@notmuchmail.org

Cc:

From: Jani Nikula


Hi all -

Notmuch remote usage [1] is a pretty handy way of accessing a notmuch
database on a remote server. However, the more you have saved searches
and tags, the slower notmuch-hello becomes, and it ends up being by far
the biggest usability issue with remote notmuch. This is because
notmuch-hello issues a separate 'notmuch count' for each saved search
and tag.

One could argue that notmuch-hello should be fixed somehow, but I chose
to try another route: batch support for notmuch count. This enables
notmuch-hello to get the counts for all the saved searches or tags in a
single call. The performance improvement is huge in remote usage, but
it's not limited to that: regular local usage benefits too, although
the difference is less noticeable.
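
To illustrate the input side, here is a minimal sketch. It assumes that
'notmuch count --batch' reads one query per line from stdin and prints
one count per line in the same order. The helper name tags_to_queries
and the example tags are hypothetical, not part of the series.

```shell
#!/bin/sh
# Hypothetical helper: turn tag names (one per line on stdin) into
# notmuch queries, one per line. Assumes tag names contain no double
# quotes or newlines.
tags_to_queries() {
    sed 's/.*/tag:"&"/'
}

# Made-up example tags; a real list would come from
#   notmuch search --output=tags "*"
printf '%s\n' inbox unread "to do" | tags_to_queries

# With a notmuch database available, one process answers every query:
#   notmuch search --output=tags "*" | tags_to_queries | notmuch count --batch
```

The point of the design is that the per-process (and, remotely, the
per-connection) startup cost is paid once rather than once per tag.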

Here's a script that demonstrates one-by-one count vs. batch count,
locally and over ssh (assuming ssh key authentication is set up), over
10 iterations:

#!/bin/bash

echo "tag count:"
notmuch search --output=tags "*" | wc -l

for remote in "" "ssh example.com"; do
    export remote
    echo "one-by-one count:"
    time sh -c 'for i in `seq 10`; do notmuch search --format=text0 --output=tags "*" | xargs -0 -I "{}" $remote notmuch count tag:"{}" > /dev/null; done'

    echo "batch count:"
    time sh -c 'for i in `seq 10`; do notmuch search --format=text --output=tags "*" | sed "s/.*/tag:\"&\"/" | $remote notmuch count --batch > /dev/null; done'
done

And here's the output of it in my setup:

tag count:
36
one-by-one count:

real	0m2.349s
user	0m0.552s
sys	0m0.868s
batch count:

real	0m0.179s
user	0m0.120s
sys	0m0.064s
one-by-one count:

real	0m56.527s
user	0m1.424s
sys	0m1.164s
batch count:

real	0m2.407s
user	0m0.068s
sys	0m0.040s

As can be seen, in local usage (the first pair of results) the speedup
is more than 10x, although one-by-one notmuch count is usually
sufficiently fast. The difference is more noticeable in remote use (the
second pair of results), where the speedup is more than 20x here, and any
additional, occasional network latency is multiplied by tag count. (That
result is actually faster than usual for me, but it's still 5+ seconds
to display or refresh notmuch-hello.)
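
The ratios can be checked from the "real" times in the output above. A
small sketch follows. The helper names speedup and per_iteration are
hypothetical, and each timed run is assumed to cover 10 iterations, as
in the script.

```shell
#!/bin/sh
# speedup ONE_BY_ONE BATCH: ratio of two wall-clock times in seconds.
speedup() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1fx\n", a / b }'
}

# per_iteration TOTAL COUNT: average seconds per iteration.
per_iteration() {
    LC_ALL=C awk -v t="$1" -v n="$2" 'BEGIN { printf "%.2fs\n", t / n }'
}

# "real" times copied from the measured output, in seconds.
echo "local speedup:  $(speedup 2.349 0.179)"
echo "remote speedup: $(speedup 56.527 2.407)"
echo "remote one-by-one, per iteration: $(per_iteration 56.527 10)"
echo "remote batch, per iteration:      $(per_iteration 2.407 10)"
```

This prints 13.1x and 23.5x for the two speedups, and 5.65s versus
0.24s per iteration for the remote case.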

Mark has written a patch that I've been using to switch notmuch-hello to
use batch count. That has made me switch from running notmuch in ssh to
using remote notmuch. The great thing is that we could switch to using
that in Emacs with no special casing for remote usage, and it would
speed things up also in local use. I'm expecting Mark to post his patch
in reply to this series.

Mark actually wrote the elisp part based on the rough idea prior to any
of this cli plumbing, so I felt obliged to follow up. So thanks Mark!


BR,
Jani.


[1] http://notmuchmail.org/remoteusage/ (the page could use some
cleanup; it's really not nearly as complicated as the page suggests)


Jani Nikula (5):
  cli: remove useless strdup
  cli: extract count printing to a separate function in notmuch count
  cli: add --batch option to notmuch count
  man: document notmuch count --batch and --input options
  test: notmuch count --batch and --input options

 man/man1/notmuch-count.1 |   20 +++++++++
 notmuch-count.c          |  111 +++++++++++++++++++++++++++++++++++-----------
 test/count               |   46 +++++++++++++++++++
 3 files changed, 150 insertions(+), 27 deletions(-)

-- 
1.7.10.4

