Subject: performance on long encrypted+signed threads

Date: Mon, 30 Sep 2019 22:10:39 -0400

To: Notmuch Mail

Cc:

From: Daniel Kahn Gillmor


Hi notmuch folks--

I just wanted to note a performance problem that i'm seeing with
notmuch.  I've attached a demonstration script
("crypto-performance-test"), which sets up a temporary GnuPG homedir and
notmuch installation, adds 20 encrypted+signed messages in a thread
(using indexed cleartext with stashed session keys), and then tries to
render the thread.  It compares the performance of this rendering with
manual extraction of the ciphertext, fed in a loop to manual invocations
of gpg.


Problem Statement
-----------------

On a long thread like this where each message is both signed and
encrypted, "notmuch show" takes a long time to emit json output, even if
all the session keys for the messages are stashed.

I believe the bulk of the time spent rendering this is signature
verification.

I also note that if a comparable test is run with a large keyring, the
duration grows with the size of the keyring, but this test doesn't show
that particular behavior.

$ ./crypto-performance-test
gpg: key 18523A27026D55EB marked as ultimately trusted
Found 0 total files (that's not much mail).
No new mail.
gpg: checking the trustdb
gpg: marginals needed: 3  completes needed: 1  trust model: pgp
gpg: depth: 0  valid:   1  signed:   0  trust: 0-, 0q, 0n, 0m, 0f, 1u
gpg: next trustdb check due at 2021-09-30
Processed 20 total files in 2s (8 files/sec.).
Added 20 new messages to the database.

performance of notmuch show:
real	0m1.582s
user	0m0.049s
sys	0m0.042s

performance of piped gpg:
real	0m1.110s
user	0m1.164s
sys	0m0.200s

performance of piped gpg --no-keyring:
real	0m0.249s
user	0m0.305s
sys	0m0.175s
$

Note in particular the large gap for notmuch between wall-clock time
("real") and userspace ("user") plus kernel ("sys") CPU time.
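For readers who want to reproduce the real/user/sys split outside the shell, here is a minimal Python sketch of the same kind of measurement. It is not part of the attached crypto-performance-test script; the helper name time_command and the sleeping stand-in workload are assumptions made for illustration only.

```python
import os
import subprocess
import sys
import time


def time_command(argv):
    """Run argv and return (real, user, sys) seconds, similar to the shell's `time`."""
    before = os.times()
    start = time.monotonic()
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    real = time.monotonic() - start
    after = os.times()
    # children_user/children_system only cover descendants that were waited for.
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return real, user, system


if __name__ == "__main__":
    # Stand-in workload (an assumption): a child that mostly sleeps, so
    # wall-clock time far exceeds the CPU time charged to it.
    real, user, system = time_command(
        [sys.executable, "-c", "import time; time.sleep(0.5)"]
    )
    print(f"real {real:.3f}s  user {user:.3f}s  sys {system:.3f}s")
```

A large difference between real and user+sys means the measured process was mostly waiting, or that the work was done by a process whose CPU time was never charged back to it.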

Diagnosis
---------

If i hack up gmime to disable signature verification (see
id:87v9t9xw0l.fsf@fifthhorseman.net over on gmime-devel-list@gnome.org,
which effectively adds the --no-keyring option to gpgme's gpg
invocation), then i can reduce the kernel time and the overall delay:

$ ./crypto-performance-test
gpg: key B9F59627422A84DB marked as ultimately trusted
Found 0 total files (that's not much mail).
No new mail.
gpg: checking the trustdb
gpg: marginals needed: 3  completes needed: 1  trust model: pgp
gpg: depth: 0  valid:   1  signed:   0  trust: 0-, 0q, 0n, 0m, 0f, 1u
gpg: next trustdb check due at 2021-09-30
Processed 20 total files in 1s (11 files/sec.).
Added 20 new messages to the database.

performance of notmuch show:
real	0m0.161s
user	0m0.037s
sys	0m0.017s

performance of piped gpg:
real	0m1.083s
user	0m1.158s
sys	0m0.176s

performance of piped gpg --no-keyring:
real	0m0.247s
user	0m0.301s
sys	0m0.177s
$

But even so, there is a real difference between the CPU time and
the wall clock time.  The system i've run these tests on is not starved
for CPU or RAM, and it is not I/O bound.  Even if there were disk
troubles, the filesystem i'm running it on (where mktemp -d creates the
working directory) is a tmpfs.

My current guess is that bash's "time" built-in isn't tabulating all
user+sys CPU time actually used by notmuch because gmime's gpgme
invocations might be detaching their child processes entirely
(i.e. double-fork, orphaning the grandchild). The gpg processes that
appear during the "notmuch show" runs have parent pid 1.
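The accounting effect guessed at here can be demonstrated in isolation. The following Python sketch (POSIX only) does not involve gpg, gpgme or gmime at all; burn_cpu, run_direct and run_double_fork are made-up names, and the busy loop is a stand-in for real work. It compares the CPU time charged to a parent when the worker is a direct child versus an orphaned grandchild.

```python
import os
import resource


def burn_cpu(seconds):
    """Spin until this process has used about `seconds` of CPU time."""
    start = os.times()
    while True:
        now = os.times()
        if (now.user - start.user) + (now.system - start.system) >= seconds:
            return


def children_cpu():
    """CPU seconds charged to this process for its waited-for descendants."""
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return usage.ru_utime + usage.ru_stime


def run_direct(seconds):
    """Fork once, burn CPU in the child, wait for it; return CPU charged to us."""
    before = children_cpu()
    pid = os.fork()
    if pid == 0:
        burn_cpu(seconds)
        os._exit(0)
    os.waitpid(pid, 0)
    return children_cpu() - before


def run_double_fork(seconds):
    """Fork twice so the worker is orphaned; return CPU charged to us."""
    before = children_cpu()
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(read_fd)
        if os.fork() == 0:
            # Grandchild: does the work after its parent has gone away.
            burn_cpu(seconds)
            os.write(write_fd, b"x")
            os._exit(0)
        # Intermediate child exits at once, orphaning the grandchild.
        os._exit(0)
    os.close(write_fd)
    os.waitpid(pid, 0)
    os.read(read_fd, 1)  # block until the grandchild has finished its work
    os.close(read_fd)
    return children_cpu() - before


if __name__ == "__main__":
    print(f"direct child:        {run_direct(0.2):.3f}s charged")
    print(f"orphaned grandchild: {run_double_fork(0.2):.3f}s charged")
```

In the double-fork case the grandchild is reparented and reaped by some other process, so its CPU time never shows up in the caller's child accounting, even though the caller still waits (in wall-clock terms) for the result. Whether gpgme actually spawns gpg this way is the open question above; the sketch only shows that such a pattern would produce the observed gap.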

How to fix
----------

I think the right thing to do here is related to stashing signature
verifications when they are first encountered, but i foresee trouble
with that approach given notmuch's willingness to collapse messages that
have the same message ID.

That is: we want to stash the signature verification the first time we
view it, and when displaying a message, we want to show the original
signature verification (rather than computing it again), *as long as*
the message has not changed since the previous verification.
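As a rough illustration of "as long as the message has not changed", here is a small Python sketch of a stash keyed by a digest of the exact message bytes rather than by message ID. It is not how notmuch or GMime work today; VerificationCache, slow_verify and the returned status strings are all hypothetical, and the sketch does not address per-part verification.

```python
import hashlib


class VerificationCache:
    """Hypothetical stash of signature-verification results, keyed by content."""

    def __init__(self, verify):
        self._verify = verify   # callable: bytes -> result (the slow path)
        self._results = {}      # SHA-256 hex digest -> stashed result

    def lookup(self, raw_message):
        """Return the stashed result, verifying only on first sight of these bytes."""
        digest = hashlib.sha256(raw_message).hexdigest()
        if digest not in self._results:
            self._results[digest] = self._verify(raw_message)
        return self._results[digest]


# Stand-in for a real (expensive) verification call; records each invocation.
calls = []


def slow_verify(raw_message):
    calls.append(raw_message)
    return "good" if b"signed" in raw_message else "none"


cache = VerificationCache(slow_verify)
original = b"Message-ID: <a@example>\n\nsigned body"
tampered = b"Message-ID: <a@example>\n\nother body"

cache.lookup(original)   # first view: verification runs
cache.lookup(original)   # unchanged bytes: stashed result is reused
cache.lookup(tampered)   # same message ID, different bytes: verified again
print(len(calls))
```

Keying on the bytes instead of the message ID means two different files that share a message ID never share a stashed result, at the cost of re-hashing the file on every view.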

We can't do this at all until GMime gains the capability, and even once
it does, i don't know how to do it safely.

Any suggestions on how we can ensure that a cached verification applies
to all parts of the message being rendered via "notmuch show"?

    --dkg

signature.asc (application/pgp-signature)
_______________________________________________
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch
