Re: Reimagining notmuch-git/nmbug

Subject: Re: Reimagining notmuch-git/nmbug

Date: Mon, 03 Apr 2023 16:40:40 -0300

To: Felipe Contreras

Cc: notmuch@notmuchmail.org

From: David Bremner


David Bremner <david@tethera.net> writes:

> Indeed that speeds up the initial clone on this machine from 39 minutes
> (I switched machines) to 30s. I will play with it a bit more, and report
> back.

It's not a showstopper, but "git pull" takes about 1/2 the wall time
(about 2/3 of the CPU time) of the original clone, even if there is only
one tag changed.

I can think of two potential improvements.

- notmuch-dump.c calls notmuch_query_set_sort (query,
  NOTMUCH_SORT_UNSORTED). I think I managed to do the same here (diff
  below), but the performance gain was negligible.

- Since you cache the lastmod value, you should be able to use it in a
  query. This does make a big difference in my experiments. I had to
  remove the 'deleteall' (otherwise only the changed messages are left
  in the git repo). I'm not 100% sure this is correct; hopefully you
  can see that quicker than I can. In any case, the lastmod query is
  what notmuch-git uses.

diff --git a/git-remote-nm b/git-remote-nm
index c668b38..cabea26 100755
--- a/git-remote-nm
+++ b/git-remote-nm
@@ -148,9 +148,11 @@ def wr_import(ref)
   wr_data("lastmod: %d\n" % ($lastmod || 0))
   wr_l 'from refs/notmuch/master^0' if $lastmod
 
-  wr_l 'deleteall'
+#  wr_l 'deleteall'
 
-  $db.query('').search_messages.each do |msg|
+  $query=$db.query("lastmod:%d.." % ($lastmod || 0) )
+  $query.sort=Notmuch::SORT_UNSORTED
+  $query.search_messages.each do |msg|
     hash = Blake2b.hex(msg.message_id, Blake2b::Key.none, 2)
     dir1, dir2 = hash[..1], hash[2..]
     wr_l 'M 644 inline %s/%s/%s/tags' % [dir1, dir2, encode_filename(msg.message_id)]
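
To make the interaction between the lastmod query and 'deleteall' in the diff above concrete, here is a minimal sketch in plain Ruby. It does not use the notmuch bindings or git fast-import. `Message`, `changed_since` and `import` are hypothetical stand-ins invented for illustration, and the toy model assumes the `lastmod:N..` range is inclusive. Only the query string format is taken from the diff.

```ruby
# Hypothetical stand-in for a notmuch message: id, tags, and the
# database revision at which it was last modified.
Message = Struct.new(:id, :tags, :lastmod)

# Build the incremental query string, in the same format as the diff.
def lastmod_query(lastmod)
  'lastmod:%d..' % (lastmod || 0)
end

# Toy model of the search: keep messages modified at or after the
# cached revision (assumes the range is inclusive).
def changed_since(messages, lastmod)
  messages.select { |m| m.lastmod >= (lastmod || 0) }
end

# Toy model of one import commit on top of an existing tree
# (a Hash of message id => tags). With deleteall the commit starts
# from an empty tree; without it, it starts from the parent's tree.
def import(tree, messages, deleteall:)
  result = deleteall ? {} : tree.dup
  messages.each { |m| result[m.id] = m.tags }
  result
end

# Initial clone: no cached lastmod, so every message is exported.
db = [Message.new('a@x', ['inbox'], 3), Message.new('b@x', ['unread'], 7)]
tree = import({}, changed_since(db, nil), deleteall: true)

# One tag changes on b@x at revision 9; the cached lastmod is 8.
db[1] = Message.new('b@x', ['replied'], 9)
changed = changed_since(db, 8)

with_deleteall    = import(tree, changed, deleteall: true)
without_deleteall = import(tree, changed, deleteall: false)

puts lastmod_query(8)
puts with_deleteall.keys.inspect     # only the changed message survives
puts without_deleteall.keys.inspect  # unchanged messages are kept
```

In this toy model, a message that is removed from the database never appears in the changed set, so without 'deleteall' its entry would stay in the tree. Whether that matters for git-remote-nm is not settled here.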

        