Jani Nikula <jani@nikula.org> writes:

> On Sat, 03 Nov 2012, David Bremner <david@tethera.net> wrote:
>> Offhand I'm not sure of a good method of automatically deciding what is
>> the same message (with e.g. headers and footer text added by a mailing
>> list).
>
> Assuming there was good method, what would you do with two different
> messages that have the same message id? That is the unique id we use to
> identify messages (which should be fine per RFC 5322 and its
> predecessors; we're talking about messages from broken systems here).

We're also talking about data from "untrusted" sources. Assuming that
such data is always non-broken seems overly optimistic. (See
e.g. http://cr.yp.to/immhf/thread.html, the section "Security and
reliability issues" for one view on the matter.)

In fact, I'd say that it should be a design goal for any mail client to
deal with as much invalid input as possible. Show big, fat warning
messages if you want to, but don't just drop the message and pretend it
does not exist.

eirik