Re: viewing duplicate messages

Subject: Re: viewing duplicate messages

Date: Sun, 18 Aug 2019 16:20:08 -0400

To: Jorge P. de Morais Neto, Notmuch Mail


From: Daniel Kahn Gillmor

On Sat 2019-08-17 19:12:26 -0300, Jorge P. de Morais Neto wrote:
> I have attached a tarball with three homonymous messages from Dell.  The
> last (most recent) two have the same subject and bodies, but the first
> (earliest) one is different and yet they all have Message-Id 1. I have
> included the Notmuch list as a recipient because the tarball is a mere
> 11252B.

thanks for this.  Looking at the headers, it occurs to me that the
problem might actually be that Dell ("")
might not be including a Message-ID header at all, and it is being added
by their IronPort/Sophos AV client as it passes through their mail system.

I suspect this possibility because the placement of the Message-ID
header itself is suspiciously high up in the list of headers (it looks
like it might have been placed there by the initial relaying MTA, rather
than the MUA).

If this is the case, it could be solved in one of two ways: they could
inject a proper unique Message-ID before handing the message off to
IronPort; or they could fix their IronPort appliance to inject a proper
unique Message-ID header.
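
As a rough illustration of the sender-side fix, here is a minimal Python
sketch that adds a unique Message-ID only when a message lacks one.  The
helper name ensure_message_id and the domain "example.invalid" are
placeholders of mine, not anything Dell or IronPort actually uses;
make_msgid comes from Python's standard email.utils module.

```python
from email.message import EmailMessage
from email.utils import make_msgid


def ensure_message_id(msg, domain="example.invalid"):
    """Add a unique Message-ID header only if the message has none.

    Hypothetical helper: the name and the default domain are
    placeholders for illustration.
    """
    if msg["Message-ID"] is None:
        # make_msgid() combines a timestamp, the process id and random
        # bits, so two calls should not produce the same identifier.
        msg["Message-ID"] = make_msgid(domain=domain)
    return msg


first = ensure_message_id(EmailMessage())
second = ensure_message_id(EmailMessage())
print(first["Message-ID"])
print(second["Message-ID"])
```

An existing Message-ID is left untouched, so a message that already
carries a proper identifier is not rewritten in transit.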

That's all about fixing it on the sender side though.  Are there
possible fixes on the receiving side?

one thought is that notmuch could treat an obviously low-entropy
message-ID the same way that it treats a message with no Message-ID at
all.  Of course, that raises the question: what is a low-entropy message
ID? A single-character message-id is pretty clearly too low-entropy to
be useful, and if we said "1-character long" was too short, it would at
least avoid this particular mistake.

i also note that NEWS claims (in the section for notmuch 0.17) that
notmuch treats "overlong" message-ids in the same way as missing
message-ids, but i don't see where that distinction is done in the code.
It doesn't appear to be in lib/message-file.c, where the notmuch-sha1-*
generation is done.  But anyway, if we are treating "overlong"
message-ids as missing, it's nicely symmetric to treat "overshort"
message-ids in the same way.
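
To make the receiving-side idea concrete, here is a small Python sketch
(not notmuch's actual C code) of treating "overshort" and "overlong"
message-ids like missing ones.  The function name and both thresholds
are assumptions chosen for illustration, and hashing the raw message
bytes into a "notmuch-sha1-" id is my simplified reading of the
fallback, not a copy of what lib/message-file.c does.

```python
import hashlib

# Hypothetical thresholds, chosen only for illustration.
MIN_ID_LENGTH = 2     # a 1-character id such as "1" is rejected
MAX_ID_LENGTH = 256   # assumed cutoff for "overlong" ids


def effective_message_id(header_value, raw_message):
    """Return the id to index a message under.

    header_value: the Message-ID header as a string, or None if absent.
    raw_message:  the full message as bytes, hashed for the fallback.
    """
    # Drop surrounding whitespace and angle brackets: "<1>" becomes "1".
    msgid = (header_value or "").strip().strip("<>")
    if MIN_ID_LENGTH <= len(msgid) <= MAX_ID_LENGTH:
        return msgid
    # Missing, overshort or overlong: derive an id from the content, so
    # different messages no longer collide on the same identifier.
    return "notmuch-sha1-" + hashlib.sha1(raw_message).hexdigest()


# Two different messages that both claim Message-ID <1>:
a = effective_message_id("<1>", b"Subject: order shipped\n\nbody one\n")
b = effective_message_id("<1>", b"Subject: order shipped\n\nbody two\n")
print(a != b)
```

With a length-based rule like this, byte-identical copies of a message
still share one id, while distinct messages that reuse the same
single-character Message-ID are kept apart.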
