Re: compacting the notmuch database through systemd

Subject: Re: compacting the notmuch database through systemd

Date: Wed, 04 Dec 2019 13:09:03 -0500

To: Antoine Beaupré, notmuch@notmuchmail.org

From: Daniel Kahn Gillmor


Thanks for raising this, Anarcat!

One more advantage that i think you haven't noted yet about regular
database compaction:

"notmuch compact" tends to get rid of a lot of lingering written data
that is no longer referenced.  While this isn't robust "secure
deletion", it's a lot better than not compacting.  see
https://trac.xapian.org/ticket/742 for more discussion.

Some questions below…

On Sun 2019-12-01 15:52:19 -0500, Antoine Beaupré wrote:

> Thanks to Bremner, I just realized that notmuch-compact(1) is a thing,
> and that thing allows me to compress my notmuch databases by about 50%.

do you know why you get the large size/speed gain?  do you regularly
delete files from your message archive?

> So I whipped together two systemd units (attached) that will run that
> command every month on my notmuch database. Just drop them in
> `~/.config/systemd/user/` and run:
>
>     systemctl --user daemon-reload
>     systemctl --user enable notmuch-compact.timer
>     systemctl --user start notmuch-compact.timer

("systemctl --user enable --now notmuch-compact.timer" will suffice for
the final two commands on any reasonably modern version of systemd)

How long does it take for notmuch-compact.service to complete?

What happens if this is happening when, say, you put your machine to
sleep, or you power it down?

While notmuch-compact.service is running, does "notmuch new" or "notmuch
insert" work?  If not, how do they fail (e.g. blocking indefinitely,
returning a comprehensible error message)?

Can you read your mail while notmuch-compact.service is running?

> Maybe those could be shipped with the Debian package somehow? Not sure
> how that works, but I think that's how gpg-agent gets started now, if
> you want any inspiration...

gpg-agent is socket-activated, which is different from the
timer-activation you are proposing here.

We could easily ship these systemd user unit files in the notmuch
package now that #764678 is resolved.  Do you think that the timer
should be enabled by default?

What should happen if the user hasn't set up notmuch?  Maybe we need a
ConditionPathExists= or something like that on either the .timer or the
.service?
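
As a rough, untested sketch of that idea: the .service could carry a
condition so that it is skipped quietly when notmuch was never
configured.  The path to the notmuch binary and the choice of
~/.notmuch-config as the marker file are my assumptions, not something
taken from the attached units:

```ini
[Unit]
Description=Compact the notmuch database
# Assumption: the config file lives at the default ~/.notmuch-config
# (%h expands to the user's home directory).  If the file is missing,
# systemd skips the unit instead of marking it as failed.
ConditionPathExists=%h/.notmuch-config

[Service]
Type=oneshot
# Assumption: notmuch is installed at /usr/bin/notmuch.
ExecStart=/usr/bin/notmuch compact
```

A user who relocates the config via NOTMUCH_CONFIG would never get
compacted with this check, so it is only a starting point.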

Do we expect this to run even when the user isn't logged in at all (a
background compaction)?

it always gets more complex when you think about trying to do it at
scale :)

> It would be great if notmuch-new ran this on its own, when it
> thought that this was "important", somehow like git-gc sometimes runs on
> its own.

I'm not convinced i like this idea without more profiling and an
understanding of what it might cause.  I have grown to *really* dislike
the highly variable latency and warnings caused by GnuPG's
"auto-check-trustdb", for example (especially as the keyring grows
larger).

> [ notmuch-compact.timer: text/plain ]
> [Unit]
> Description=compact the notmuch database

systemd timer unit descriptions typically include some mention of how
often they fire.  See for example:

/lib/systemd/system/systemd-tmpfiles-clean.timer
"Daily Cleanup of Temporary Directories"

/lib/systemd/system/certbot.timer
"Run certbot twice daily"

/lib/systemd/system/phpsessionclean.timer
"Clean PHP session files every 30 mins"

I recommend:

    Description=Compact the notmuch database every month
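
For context, a complete monthly timer along those lines might look like
the sketch below.  Only the Description= line comes from this thread;
OnCalendar=, Persistent= and the [Install] section are assumptions about
what such a unit could contain, not a copy of the attachment:

```ini
[Unit]
Description=Compact the notmuch database every month

[Timer]
# "monthly" is shorthand for 00:00 on the first day of each month.
OnCalendar=monthly
# Assumption: if the scheduled run was missed while the timer was
# inactive, run once the timer becomes active again.
Persistent=true

[Install]
WantedBy=timers.target
```

With no Unit= setting, the timer activates the service of the same name,
here notmuch-compact.service.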

> [ notmuch-compact.service: text/plain ]
> [Unit]
> Description=compact the notmuch database

The convention is to lead with an upper-case letter:

    Description=Compact the notmuch database

OK OK enough with the nit-picking!

Happy hacking,

    --dkg