Re: Reimagining notmuch-git/nmbug

Subject: Re: Reimagining notmuch-git/nmbug

Date: Mon, 3 Apr 2023 11:01:19 -0500

To: David Bremner

Cc: notmuch@notmuchmail.org

From: Felipe Contreras


On Mon, Apr 3, 2023 at 4:49 AM David Bremner <david@tethera.net> wrote:

> Performance-wise the initial clone seems pretty slow. For my 600k
> messages I have been waiting a while now.  htop tells me that
> git-fast-import has about 45 minutes of CPU time at this point.  This
> machine is not that fast, but for comparison an initial (i.e. fresh
> repo, no caching) "notmuch git commit" takes about 15-20s.

I found the problem. If all the files are in the same directory, `git
fast-import` spends a lot of time comparing all the paths.

Distributing the files across multiple directories, as notmuch-git
does using BLAKE2b, makes the operation much faster.
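
To illustrate the idea, here is a minimal sketch in Python of sharding
paths by hash. It is not the actual code from either tool: the
`shard_path` name, the `tags` prefix, the two-level layout, and the
percent-encoding of the message ID are all assumptions made for the
example.

```python
import hashlib
from urllib.parse import quote


def shard_path(message_id: str, prefix: str = "tags") -> str:
    """Map a message ID to a path spread over two directory levels.

    Hypothetical layout: <prefix>/<2 hex chars>/<2 hex chars>/<encoded id>.
    """
    # Percent-encode the ID so it is a single safe path component
    # (an assumption; a real tool may use a different encoding).
    name = quote(message_id, safe="")
    # A 2-byte BLAKE2b digest gives 4 hex characters, enough for
    # 256 * 256 buckets, so no single directory grows very large.
    digest = hashlib.blake2b(name.encode("utf-8"), digest_size=2).hexdigest()
    return f"{prefix}/{digest[:2]}/{digest[2:]}/{name}"


if __name__ == "__main__":
    print(shard_path("some-id@example.org"))
```

The same ID always maps to the same path, and each directory only has
to hold a small fraction of the entries.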

I've pushed the changes. There's now a dependency, but you can just
`gem install blake2b`.

I'm able to clone the database of the performance corpus in 5 seconds:

% git clone --bare nm::$PWD/mail mail.git

Cheers.

-- 
Felipe Contreras
_______________________________________________
notmuch mailing list -- notmuch@notmuchmail.org
To unsubscribe send an email to notmuch-leave@notmuchmail.org
