Re: [patch v3 06/12] lib: index message files with duplicate message-ids

Date: Fri, 09 Jun 2017 07:57:46 -0300

To: Daniel Kahn Gillmor


From: David Bremner

David Bremner writes:

> Daniel Kahn Gillmor writes:

>> for example, i could follow up on the current message with another
>> message with Message-Id: and
>> give it a subject "Re: [patch v3 06/12] lib: do *not* index message
>> files with duplicate message-ids".  that's a bit odd, no?
> Yes, I agree that's a bit strange.  We should make some effort to
> display the subject that belongs with a given message body. I think it's
> not too hard [1] to preserve the old behaviour of keeping the first
> subject, date, and from. This leaves us with a version of the original
> hiding message attack, but only for the special case of regex searches,
> since those rely exclusively on the value slots.
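
To make the concern quoted above concrete, here is a toy model of one
message with two files sharing a Message-Id. It is only a sketch: the
dict layout and the names index_message, term_match and regex_match are
made up for illustration and are not the notmuch or Xapian API. It
assumes terms are collected from every file while the value slot keeps
only the first file's subject.

```python
import re

# Two files with the same Message-Id but different subjects (toy data).
files = [
    {"subject": "lib: index message files with duplicate message-ids"},
    {"subject": "lib: do *not* index message files with duplicate message-ids"},
]


def index_message(files):
    """Build a toy document: terms from all files, value from the first."""
    terms = set()
    for f in files:
        # Crude tokenizer, standing in for real term generation.
        terms.update(re.findall(r"\w+", f["subject"].lower()))
    # Assumed "old behaviour": the value slot keeps the first subject only.
    return {"terms": terms, "subject_value": files[0]["subject"]}


def term_match(doc, word):
    """Term search: succeeds if any file contributed the term."""
    return word.lower() in doc["terms"]


def regex_match(doc, pattern):
    """Regex search: looks only at the value slot."""
    return re.search(pattern, doc["subject_value"]) is not None


doc = index_message(files)
print(term_match(doc, "not"))                   # term came from the second file
print(regex_match(doc, r"do \*not\* index"))    # second subject is not in the slot
```

In this model the term search finds the second file's subject and the
regex search does not, which is the remaining "hidden message" case.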

I had a slightly radical idea for how to deal with that. The subject and
from values from the extra files could be appended to the value slot
(e.g. separated by newlines). Then regexp searches would behave similarly
to term-based searches, in that a match on any file would match the
message. We'd have to be slightly careful about what anchors mean.  A
further enhancement would be to expose the search result as an array.
This kind of approach doesn't really make sense for dates, as we
essentially search for those as numbers, and such a hack would break the
built-in Xapian range