I have been able to speed that up with the code below. It raises XAPIAN_FLUSH_THRESHOLD to the total virtual memory divided by twice the average size of an email (the factor of two is just to be safe). It seems to be faster since Xapian flushes its updates to disk less often. However, I have a nagging feeling that XAPIAN_FLUSH_THRESHOLD could be even higher, since I don't see any increase in used memory (via "top -d 1").

The server in question has eight CPU cores and 8GB RAM, running Debian squeeze on a 32-bit architecture (I know - but it is what it is :) ).

# Assume an average size of 120KB per email
# and use at most half the virtual memory
XFT=$(($(free -otk | awk '/^Total/ {print $2}') / 240))

# Keep more index info in memory before flushing to disk
[ $XFT -lt 10000 ] && XFT=10000
su - archive -c "export XAPIAN_FLUSH_THRESHOLD=$XFT; notmuch new --verbose"

----- Original Message -----
> From: Felipe Contreras <felipe.contreras@gmail.com>
> To: Tom Bulli <mrbulli@yahoo.com>
> Cc: "notmuch@notmuchmail.org" <notmuch@notmuchmail.org>
> Sent: Wednesday, November 23, 2011 10:40 AM
> Subject: Re: Notmuch indexing 21 million emails
>
> On Tue, Nov 22, 2011 at 5:02 AM, Tom Bulli <mrbulli@yahoo.com> wrote:
>> I have a project where I need to search about 21 million emails - and
>> decided to use "notmuch" for it. The system is a Debian Squeeze, the
>> notmuch version is "0.8-1~bpo60+1" from "kyria's" private repository.
>>
>> I am running the "notmuch new" for approx. 4 days now - and
>> according to "notmuch count" it has indexed about 4.5 million emails.
>>
>> Is this expected performance? Is there any way to speed that up?
>
> It would be nice to run something like this with OProfile (or perf)
> and see if there's some obvious fixes.
>
> --
> Felipe Contreras
>
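
For reference, the threshold arithmetic from the reply at the top of this message can be sketched as a pair of small shell functions. This is only an illustration under stated assumptions: the names total_kb and flush_threshold are hypothetical helpers, and reading /proc/meminfo is a swapped-in alternative to "free -otk" (the -o flag may not be available in newer procps releases). The 120KB average and the 10000 floor are taken from the script above.

```shell
#!/bin/sh
# Sketch only: total_kb and flush_threshold are hypothetical helper names.

# total_kb: RAM plus swap in KB, summed from /proc/meminfo (Linux only).
# Assumption: this stands in for the "Total" line of "free -otk".
total_kb() {
    awk '/^(MemTotal|SwapTotal):/ {sum += $2} END {print sum}' /proc/meminfo
}

# flush_threshold TOTAL_KB AVG_EMAIL_KB
# Prints TOTAL_KB / (2 * AVG_EMAIL_KB), i.e. the number of average-sized
# emails that fit in half the virtual memory, but never less than 10000.
flush_threshold() {
    xft=$(( $1 / ($2 * 2) ))
    [ "$xft" -lt 10000 ] && xft=10000
    echo "$xft"
}
```

With 8388608 KB (8GB, no swap) and a 120KB average, flush_threshold prints 34952. It could be wired into the original command as XFT=$(flush_threshold "$(total_kb)" 120) before calling "notmuch new".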