On 2019-12-04 13:09:03, Daniel Kahn Gillmor wrote:
> Thanks for raising this, Anarcat!
>
> One more advantage that i think you haven't noted yet about regular
> database compaction:
>
> "notmuch compact" tends to get rid of a lot of lingering written data
> that is no longer referenced.  While this isn't robust "secure
> deletion", it's a lot better than not compacting.  see
> https://trac.xapian.org/ticket/742 for more discussion.

Cool.

> Some questions below…
>
> On Sun 2019-12-01 15:52:19 -0500, Antoine Beaupré wrote:
>
>> Thanks to Bremner, I just realized that notmuch-compact(1) is a thing,
>> and that thing allows me to compress my notmuch databases by about 50%.
>
> do you know why you get the large size/speed gain?

Not sure, but if I'd venture a guess: I never ran notmuch-compact(1) as
far as I can remember.

> do you regularly delete files from your message archive?

Yes, all the time. I have had `d` mapped to `+deleted` basically
forever, and have a pre-new hook that actually deletes those messages
from this.

Yes, I am a heretic. ;)

>> So I whipped together two systemd units (attached) that will run that
>> command every month on my notmuch database. Just drop them in
>> `~/.config/systemd/user/` and run:
>>
>>     systemctl --user daemon-reload
>>     systemctl --user enable notmuch-compact.timer
>>     systemctl --user start notmuch-compact.timer
>
> ("systemctl --user enable --now notmuch-compact.timer" will suffice for
> the final two commands on any reasonably modern version of systemd)

Whoa. TIL.

> How long does it take for the notmuch-compact.service to complete?

I don't remember... it took less than a minute at the first run, I think.

> What happens if this is happening when, say, you put your machine to
> sleep, or you power it down?

No idea. I think it's an atomic process as notmuch-compact(1) says:

       The compacted database is built in a temporary directory and is
       later moved into the place of the origin database. The original
       uncompacted database is discarded, unless the
       --backup=<directory> option is used.

> While notmuch-compact.service is running, does "notmuch new" or "notmuch
> insert" work?  If not, how do they fail (e.g. blocking indefinitely,
> returning a comprehensible error message)?

No idea. Manpage says:

       Note that the database write lock will be held during the
       compaction process (which may be quite long) to protect data
       integrity.

> Can you read your mail while notmuch-compact.service is running?

I don't see why not, but I haven't tried. Considering I run it once a
week, it would seem like a small tradeoff if that would cause problems
anyways.

>> Maybe those could be shipped with the Debian package somehow? Not sure
>> how that works, but I think that's how gpg-agent gets started now, if
>> you want any inspiration...
>
> gpg-agent is socket-activated, which is different from the
> timer-activation you are proposing here.

I thought about socket activation, but I don't think it would work in
this case.

> We could easily ship these systemd user unit files in the notmuch
> package now that #764678 is resolved.  Do you think that the timer
> should be enabled by default?

Sure, I don't see why not, unless we have concerns about
notmuch-compact(1) being unsafe or counter-productive.

> What should happen if the user hasn't set up notmuch?  Maybe we need a
> ConditionPathExists= or something like that on either the .timer or the
> .service?

Maybe:

    ConditionPathExists=$HOME/.notmuch-config

?

> Do we expect this to run even when the user isn't logged in at all (a
> background compaction?)

Maybe not? No idea.

> it always gets more complex when you think about trying to do it at
> scale :)

Yes.

>> It would be great if notmuch-new ran this on its own, when it
>> thought that this was "important", somehow like git-gc sometimes runs on
>> its own.
>
> I'm not convinced i like this idea without more profiling and an
> understanding of what it might cause.  I have grown to *really* dislike
> the highly variable latency and warnings caused by GnuPG's
> "auto-check-trustdb", for example (especially as the keyring grows
> larger).

Again, tradeoffs: I prefer to have my trustdb actually checked once in a
while (right?) and not pay that latency cost at some random gpg
invocation (which seems to happen all the time). So I disable the
built-in, inline checks and queue them in a timer instead.

>> [ notmuch-compact.timer: text/plain ]
>> [Unit]
>> Description=compact the notmuch database
>
> systemd timer unit descriptions typically include some mention of the
> duration.  See for example:
>
>     /lib/systemd/system/systemd-tmpfiles-clean.timer
>         "Daily Cleanup of Temporary Directories"
>
>     /lib/systemd/system/certbot.timer
>         "Run certbot twice daily"
>
>     /lib/systemd/system/phpsessionclean.timer
>         "Clean PHP session files every 30 mins"
>
> I recommend:
>
>     Description=Compact the notmuch database every month

Cool.

>> [ notmuch-compact.service: text/plain ]
>> [Unit]
>> Description=compact the notmuch database
>
> The convention is to lead with an upper-case letter:
>
>     Description=Compact the notmuch database

Yay!

> OK OK enough with the nit-picking!

Thanks for the review!

a.

-- 
L'adversaire d'une vraie liberté est un désir excessif de sécurité.
                        - Jean de la Fontaine
_______________________________________________
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch
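
For readers without the attachments: a minimal sketch of what a pair of
user units along the lines discussed in this thread might look like.
These are not the files attached to the original message. The binary
path (/usr/bin/notmuch), OnCalendar=monthly and Persistent=true are
assumptions made for illustration. The condition uses the %h specifier
(the user's home directory) rather than $HOME, because systemd does not
expand environment variables in Condition*= settings.

```ini
# ~/.config/systemd/user/notmuch-compact.service (illustrative sketch)
[Unit]
Description=Compact the notmuch database
# Skip the run when notmuch has not been configured for this user.
# %h is systemd's specifier for the home directory of the user
# running this service manager instance.
ConditionPathExists=%h/.notmuch-config

[Service]
Type=oneshot
# Assumed install location of the notmuch binary; adjust as needed.
ExecStart=/usr/bin/notmuch compact

# ~/.config/systemd/user/notmuch-compact.timer (illustrative sketch)
[Unit]
Description=Compact the notmuch database every month

[Timer]
# Assumed schedule: once a month.
OnCalendar=monthly
# Run at the next opportunity if the scheduled time was missed,
# e.g. because the machine was off or the user was not logged in.
Persistent=true

[Install]
WantedBy=timers.target
```

The two sections above are meant to live in two separate files, named
in the comments. With both in place, "systemctl --user enable --now
notmuch-compact.timer" activates the timer, as noted in the thread.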