On Wed, 16 Jan 2013, Tomi Ollila <tomi.ollila@iki.fi> wrote:
> On Wed, Jan 16 2013, Mark Walters <markwalters1009@gmail.com> wrote:
>
>> On Tue, 15 Jan 2013, Jani Nikula <jani@nikula.org> wrote:
>>> Hi all -
>>>
>>> Notmuch remote usage [1] is a pretty handy way of accessing a notmuch
>>> database on a remote server. However, the more you have saved searches
>>> and tags, the slower notmuch-hello becomes, and it ends up being by
>>> far the biggest usability issue with remote notmuch. This is because
>>> notmuch-hello issues a separate 'notmuch count' for each saved search
>>> and tag.
>>>
>>> One could argue that notmuch-hello should be fixed somehow, but I chose
>>> to try another route: batch support for notmuch count. This enables
>>> notmuch-hello to get the counts for all the saved searches or tags in a
>>> single call. The performance improvement is huge in remote usage, but
>>> it's not limited to that. Regular local usage benefits from it too, but
>>> it's not as obviously noticeable.
>>
>> This series looks good to me (that is, the code looks fine).
>>
>> Two questions are:
>>
>> Do we want this functionality? I think it is useful even on local setups,
>> particularly if people have lots of tags (the section that shows all
>> tags can be quite noticeably sped up). It is a substantial improvement
>> on remote setups but I am not sure if that is sufficiently common to
>> warrant the change. At least the code path is the same so it will get
>> enough testing.
>
> I do want the functionality. Especially where I am now it takes about
> 0.4 sec for 'ssh remote echo foo' to get executed (using connection sharing).
> Pipelining the count requests could make all the count requests emacs
> does (in my current set) complete in less than 1 sec.
>
>> Secondly, if we do want the functionality, should it be more general so
>> that it can do searches etc too? I think this is less clear. Count is likely
>> to be the most useful one since running several (simultaneous) counts is
>> probably more common than running several simultaneous searches.
>
> One could argue that we should send json "documents" to notmuch in
> stdin and notmuch would output json(/sexp) "documents". That is just
> SMOP. I bet Austin would like this solution, especially the part
> that involves writing or integrating json parser >;).
> I'd be happy with this 'batch' approach.
>
> I'll be testing this soon, but refrain from reviewing the code
> until 0.15 is out.

id:87a9s5cp38.fsf@zancas.localnet ;)

J.

>
>>
>> Best wishes
>>
>> Mark
>
>
> Tomi
>
>
>>
>>
>>>
>>> Here's a script that demonstrates one-by-one count vs. batch count,
>>> locally and over ssh (assuming ssh key authentication is set up), over
>>> 10 iterations:
>>>
>>> #!/bin/bash
>>>
>>> echo "tag count:"
>>> notmuch search --output=tags "*" | wc -l
>>>
>>> for remote in "" "ssh example.com"; do
>>>     export remote
>>>     echo "one-by-one count:"
>>>     time sh -c 'for i in `seq 10`; do notmuch search --format=text0 --output=tags "*" | xargs -0 -n 1 -I "{}" $remote notmuch count tag:"{}" > /dev/null; done'
>>>
>>>     echo "batch count:"
>>>     time sh -c 'for i in `seq 10`; do notmuch search --format=text --output=tags "*" | sed "s/.*/tag:\"\0\"/" | $remote notmuch count --batch > /dev/null; done'
>>> done
>>>
>>> And here's the output of it in my setup:
>>>
>>> tag count:
>>> 36
>>> one-by-one count:
>>>
>>> real    0m2.349s
>>> user    0m0.552s
>>> sys     0m0.868s
>>> batch count:
>>>
>>> real    0m0.179s
>>> user    0m0.120s
>>> sys     0m0.064s
>>> one-by-one count:
>>>
>>> real    0m56.527s
>>> user    0m1.424s
>>> sys     0m1.164s
>>> batch count:
>>>
>>> real    0m2.407s
>>> user    0m0.068s
>>> sys     0m0.040s
>>>
>>> As can be seen, in local usage (the first pair of results) the speedup
>>> is more than 10x, although one-by-one notmuch count is usually
>>> sufficiently fast. The difference is more noticeable in remote use (the
>>> second pair of results), where the speedup is more than 20x here, and any
>>> additional, occasional network latency is multiplied by tag count. (That
>>> result is actually faster than usual for me, but it's still 5+ seconds
>>> to display or refresh notmuch-hello.)
>>>
>>> Mark has written a patch that I've been using to switch notmuch-hello to
>>> use batch count. That has made me switch from running notmuch in ssh to
>>> using remote notmuch. The great thing is that we could switch to using
>>> that in Emacs with no special casing for remote usage, and it would
>>> speed things up also in local use. I'm expecting Mark to post his patch
>>> in reply to this series.
>>>
>>> Mark actually wrote the elisp part based on the rough idea prior to any
>>> of this cli plumbing, so I felt obliged to follow up. So thanks Mark!
>>>
>>>
>>> BR,
>>> Jani.
>>>
>>>
>>> [1] http://notmuchmail.org/remoteusage/ (the page could use some
>>> cleanup; it's really not nearly as complicated as the page suggests)
>>>
>>>
>>> Jani Nikula (5):
>>>   cli: remove useless strdup
>>>   cli: extract count printing to a separate function in notmuch count
>>>   cli: add --batch option to notmuch count
>>>   man: document notmuch count --batch and --input options
>>>   test: notmuch count --batch and --input options
>>>
>>>  man/man1/notmuch-count.1 |   20 +++++++++
>>>  notmuch-count.c          |  111 +++++++++++++++++++++++++++++++++++----------
>>>  test/count               |   46 +++++++++++++++++++
>>>  3 files changed, 150 insertions(+), 27 deletions(-)
>>>
>>> --
>>> 1.7.10.4
>> _______________________________________________
>> notmuch mailing list
>> notmuch@notmuchmail.org
>> http://notmuchmail.org/mailman/listinfo/notmuch
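
As a quick sanity check on the figures quoted above, here is a minimal sketch that recomputes the ratios from the "real" timings in the quoted output. The helper name `ratio` is made up for illustration, the numbers are copied by hand from the results above, and the division by 10 assumes the ten iterations used in the quoted script.

```shell
#!/bin/sh
# Hypothetical helper: print $1 / $2 rounded to one decimal place.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# "real" times in seconds, copied from the quoted output.
echo "local speedup (one-by-one / batch):  $(ratio 2.349 0.179)x"
echo "remote speedup (one-by-one / batch): $(ratio 56.527 2.407)x"

# The quoted script loops 10 times, so divide by 10 for a single pass
# over all tags, which is roughly one notmuch-hello refresh.
echo "remote one-by-one, per pass: $(ratio 56.527 10)s"
echo "remote batch, per pass:      $(ratio 2.407 10)s"
```

This comes out at roughly 13x locally and 23x remotely, in line with the "more than 10x" and "20x" estimates, and the per-pass one-by-one figure matches the "5+ seconds" remark.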