[notmuch] Notmuch performance (literally, in my case)

Subject: [notmuch] Notmuch performance (literally, in my case)

Date: Sun, 14 Mar 2010 22:59:28 -0700 (PDT)

To: notmuch

Cc:

From: Ben Gamari


Out of curiosity: How does notmuch perform for you?

I just started using it on a daily basis last week. I love the power that
notmuch provides and the emacs interface is surprisingly usable after you've
gotten used to it, but that being said, I've been quite surprised by how poorly
the kernel responds to the workload presented by notmuch.

Database updates in particular are awful; notmuch new will easily take 3
minutes to index 10 messages, and for much of that time the machine will be
close to unusable: input device events are handled sporadically, music will
stutter, and applications will go unresponsive for tens of seconds at a time.
These are the traits usually associated with pagefile thrashing. Even something
as simple as saving a file or starting a terminal can take tens of seconds.
Considering I have notmuch new run in my crontab, this gets old quite quickly.
It's really quite awful.

As far as I can tell, this is a result of the horrendous behavior fsync()
invokes in the kernel. I find that performance also suffers in similar ways when
doing backups with rsync, which also seems to use fsync(). During these slow
periods, I/O wait time dominates top while disk throughput hovers at less than
1 MByte/second. I have 4GB of memory and a fairly fast hard drive (for a laptop),
yet somehow the system is still barely usable. Meanwhile, latencytop shows large
amounts of time (sometimes 30 seconds or more) spent handling page faults.
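
For anyone who wants to compare numbers, here is a minimal sketch of how the
fsync() stalls could be measured in isolation. It assumes the latency really
does come from fsync() itself; the function name time_fsync_writes and its
parameters are made up for illustration, and the write pattern (small appends,
each followed by a sync) is only a rough stand-in, not what notmuch or Xapian
actually does.

```python
import os
import tempfile
import time


def time_fsync_writes(directory, count=10, size=4096):
    """Append `count` blocks of `size` bytes to a scratch file in
    `directory`, calling fsync() after each write.

    Returns the list of per-call fsync() latencies in seconds.
    This is a hypothetical helper, not part of notmuch or Xapian.
    """
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        block = b"\0" * size
        for _ in range(count):
            os.write(fd, block)
            start = time.monotonic()
            os.fsync(fd)  # force the data out to the disk
            latencies.append(time.monotonic() - start)
    finally:
        os.close(fd)
        os.unlink(path)  # remove the scratch file again
    return latencies


if __name__ == "__main__":
    # Run against the current directory; point it at the filesystem
    # holding the mail store to test that disk instead.
    results = time_fsync_writes(".", count=20)
    print("max fsync latency:  %.3f s" % max(results))
    print("mean fsync latency: %.3f s" % (sum(results) / len(results)))
```

Running this while another heavy writer (an rsync backup, say) is active on the
same filesystem should show whether individual fsync() calls are what blocks
for seconds at a time.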

Has anyone else observed similarly poor behavior? I am currently using
btrfs on this machine, although ext4 doesn't seem to be any better. Notmuch is
using Xapian 1.08-1.99karmic from the Xapian backports PPA, which I believe
includes the recent database update optimizations.

I would really like to get to the bottom of this behavior. There have been many
attempts[1-8] in the past, but to this day the kernel still seems to
suffer under these sorts of workloads. Anyway, I'm just wondering whether you
all are seeing similar issues. I've never had so reliable a means of reproducing these
latencies, but I think I might bring the issue to the LKML again if I get some
responses. Any feedback in either direction would be greatly appreciated.

Thanks!

- Ben


[1] http://bugzilla.kernel.org/show_bug.cgi?id=5900
[2] http://bugzilla.kernel.org/show_bug.cgi?id=7372
[3] http://lkml.org/lkml/2009/5/16/225
[4] http://lkml.org/lkml/2009/4/28/24
[5] http://lkml.org/lkml/2007/7/21/219
[6] http://lkml.org/lkml/2009/3/26/72
[7] http://lwn.net/Articles/328363/
[8] http://lkml.org/lkml/2009/4/6/114

