WARNING: reindexing is an intrusive operation. I don't think this will corrupt your database, but previous versions thrashed threading pretty well. notmuch-dump is your friend.

[PATCH 01/10] lib: isolate n_d_add_message and helper functions into
[PATCH 02/10] lib/n_d_add_message: refactor test for new/ghost
[PATCH 03/10] lib: factor out message-id parsing to separate file.
[PATCH 04/10] lib: refactor notmuch_database_add_message header

The first 4 patches are just code movement. database.cc has gotten too large to understand (for me), so this is mainly trying to group functions together in some logical way.

[PATCH 06/10] lib: index message files with duplicate message-ids

The diff here has grown a bit, but the idea is still simple: add terms and values for all files with a given message id.

[PATCH 07/10] WIP: Add message count to summary output

This patch gives the user some hints about the existence of multiple files per message-id.

[PATCH 08/10] lib: add _notmuch_message_remove_indexed_terms

This just iterates over terms, and kills any that are recoverable.

[PATCH 09/10] lib: add notmuch_message_reindex

This is the trickiest code here, and it ends up using several of the functions called by notmuch_database_add_message, rather than calling it directly.

[PATCH 10/10] add "notmuch reindex" subcommand

This should probably have at least a few more tests: in particular, preservation of message properties is not tested yet. Also, more tests involving threading are needed, since it turned out to be surprisingly hard to trigger some bugs (i.e. there were bugs triggered only by one of the two corpora, and only by one of Xapian 1.2 vs 1.4).

The good news is that there really seems to be a speed payoff for this extra complication. Reindexing all messages went from about twice as long as the initial notmuch new to about 60% of that speed. I'm a little skeptical about the peak memory use, but so far I didn't see any serious looking memory leaks.
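
For readers who want a concrete picture of the term-removal step described for patch 08, here is a minimal toy sketch of "iterate over terms and kill the recoverable ones". It is not the code from the patch: it models a document's term list as a std::set of strings instead of a Xapian document, and the function names and the choice of "K" and "XPROPERTY" as the prefixes to keep are assumptions made purely for illustration.

```cpp
// Toy model only: NOT the notmuch implementation.
// A term list is modelled as a set of strings; the prefixes below are
// hypothetical stand-ins for "terms that cannot be rebuilt from the files".
#include <algorithm>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Assumed prefixes of terms that must survive a reindex (e.g. user data).
static const std::vector<std::string> kept_prefixes = { "K", "XPROPERTY" };

// A term is "recoverable" here if it starts with none of the kept prefixes,
// i.e. we assume it could be regenerated by indexing the message files again.
static bool
is_recoverable (const std::string &term)
{
    return std::none_of (kept_prefixes.begin (), kept_prefixes.end (),
			 [&term] (const std::string &prefix) {
			     return term.compare (0, prefix.size (), prefix) == 0;
			 });
}

// Walk the term list once, erase every recoverable term, and return the
// number of terms removed. Non-recoverable terms are left untouched.
static std::size_t
remove_indexed_terms (std::set<std::string> &terms)
{
    std::size_t removed = 0;

    for (auto it = terms.begin (); it != terms.end ();) {
	if (is_recoverable (*it)) {
	    it = terms.erase (it);	/* erase returns the next valid iterator */
	    removed++;
	} else {
	    ++it;
	}
    }
    return removed;
}
```

In this sketch, calling remove_indexed_terms on a set holding one "K"-prefixed term and some other terms leaves only the "K"-prefixed term behind; a reindex would then add the removed terms back by indexing the files again.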