subjects and duplicated message id's

Subject: subjects and duplicated message id's

Date: Thu, 14 Dec 2017 10:03:12 -0400



From: David Bremner

There are currently several somewhat related issues with notmuch's
handling of subject headers for messages with duplicate message-ids
(i.e. several files on disk with the same message id).  These are all
reflections of the fact that we use a value slot for subjects in the
database message document (i.e. the database object keyed by the
message-id).  Among other things, using a value slot is what makes
regular expression searching (and potentially sorting) by subject work.

The following problems arise when we have multiple files with the same
message-id but different subjects (probably indicating a "real" mid
collision):

1. The output of notmuch-show can be inconsistent with notmuch-search

   - this is because show is reading from the lexicographically first
     file, while search is reading the database value slot.

   - in principle this could be fixed by making show read the value
     slot; but then the subject might not be consistent with the rest of
     the message content. Also, it looks like a bit of a pain to refactor
     so all that sprinter code has database access.

   - we could also force the value slot to have the lexicographically
     first file's subject during indexing. This would be a bit fiddly,
     but localized.
     It would have the surprising effect of having the subject updated
     when new messages arrived.

2. Regular expression search doesn't work for subjects not in the value
   slot

   - this could be fixed by putting all subjects in the value slot,
     perhaps as newline separated strings. This would also be a
     potential solution for the "subject hiding" issue mentioned above,
     although it would take some front-end effort as well to deal with
     "multi-subjects".  This could be reported in e.g. json output as an
     array of subjects.
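
To make the newline separated idea concrete, here is a rough sketch in
Python. It is not notmuch code and does not touch Xapian; the function
names, the example paths and the json keys ("id", "subject") are all
made up for illustration. It assumes subjects are whitespace-normalized
before storing, so a stored subject never contains a newline itself.

```python
import json
import re


def pack_subjects(subjects_by_path):
    """Join the subjects of all files for one message-id into one value.

    Paths are sorted so the lexicographically first file's subject comes
    first; duplicate subjects are dropped while keeping that order.
    """
    seen = []
    for path in sorted(subjects_by_path):
        # Collapse whitespace so a subject can never contain the separator.
        subject = " ".join(subjects_by_path[path].split())
        if subject not in seen:
            seen.append(subject)
    return "\n".join(seen)


def unpack_subjects(value):
    """Split a packed value back into a list of subjects."""
    return value.split("\n") if value else []


def subject_matches(value, pattern):
    """Return True if the regex matches any single stored subject."""
    regex = re.compile(pattern)
    return any(regex.search(s) for s in unpack_subjects(value))


def to_json(message_id, value):
    """Report the subjects as an array (hypothetical output shape)."""
    return json.dumps({"id": message_id, "subject": unpack_subjects(value)})
```

Matching each subject separately, rather than running the regex over
the whole packed value, keeps anchors like ^ and $ meaningful and stops
a pattern from accidentally matching across two different subjects.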

I'm open to other, better ideas of how to do this. I'm also curious how
important people think these bugs are.

notmuch mailing list