These are mainly RFC because I'm not 100% sure about the performance
impact. It seems OK for me: indexing my 500k messages, about 35k of
which are duplicates, is about 3% slower. I didn't see a noticeable
increase in database size (in both cases it's 5.8G / 3.5G before/after
notmuch compact).

There are also tons of UI issues. For example, in the test case here,

    notmuch search subject:'"message 2"'

will happily print

    thread:0000000000000001 2001-01-05 [1/1] Notmuch Test Suite; message 1 (inbox unread)

I claim it's still an improvement over the current code, where that
second message is not findable by any terms unique to it.