Re: Handling mislabeled emails encoded with Windows-1252

Date: Tue, 24 Jul 2018 15:55:54 +0200

To: David Bremner,


From: Sebastian Poeplau

Hi again,

>> Everyone's mail situation is unique, but I haven't noticed this
>> problem. Do you have a mechanical (e.g. scripted) way of detecting such
>> mails? I suppose it could just look for characters in the range 0x80 to
>> 0x95 in allegedly ISO_8859-1 messages. A census of the situation in my
>> own mail would help me think about this problem, I think.
> Yes, I guess that should be a good enough heuristic for detecting
> affected mail. I'll try to come up with a simple script and post it
> here.

Attached is a Python script that checks individual message files and
prints their name if it finds them to contain mislabeled Windows-1252
text. The heuristic seems to work well on my mail - let me know if you
encounter any issues!
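
The attachment itself is not reproduced in this archive. As a rough
illustration only, here is a minimal sketch of the heuristic discussed
above; it is not the attached script. It assumes that a message counts
as affected when a text part labeled ISO-8859-1 contains bytes in the
range 0x80-0x9F, which are control characters in ISO-8859-1 but mostly
printable characters (curly quotes, dashes, the euro sign) in
Windows-1252. That range is slightly wider than the 0x80-0x95 suggested
above. The names looks_mislabeled, SUSPICIOUS and LATIN1_NAMES are
hypothetical.

```python
import email
import sys
from email import policy

# Assumption: flag the whole C1 range. These byte values are control
# characters in ISO-8859-1 but mostly printable in Windows-1252.
SUSPICIOUS = frozenset(range(0x80, 0xA0))

# Assumption: common spellings of the ISO-8859-1 label; extend as needed.
LATIN1_NAMES = {"iso-8859-1", "iso_8859-1", "latin1", "latin-1"}


def looks_mislabeled(raw: bytes) -> bool:
    """Return True if a text part labeled ISO-8859-1 has bytes in 0x80-0x9F."""
    msg = email.message_from_bytes(raw, policy=policy.default)
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        charset = (part.get_content_charset() or "").lower()
        if charset not in LATIN1_NAMES:
            continue
        # decode=True undoes base64 / quoted-printable and returns bytes.
        payload = part.get_payload(decode=True)
        if payload and any(b in SUSPICIOUS for b in payload):
            return True
    return False


if __name__ == "__main__":
    # Print the name of every message file that looks affected.
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            if looks_mislabeled(f.read()):
                print(path)
```

Saved under a name of your choice, it would be run with one or more
message files as arguments, for example over the files in a maildir's
cur/ directory. It only inspects message bodies, not headers.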

Sebastian

[Attachment: application/octet-stream]
notmuch mailing list