Re: [PATCH] test: Add test for searching of uncommonly encoded messages

Subject: Re: [PATCH] test: Add test for searching of uncommonly encoded messages

Date: Sat, 25 Feb 2012 12:36:00 +0400

To: notmuch@notmuchmail.org

Cc:

From: Serge Z


Hi!
I've run into another problem:

I've got a text/html email whose body is encoded in cp1251.
Its encoding is declared both in the Content-Type email header and in the HTML
<meta> tag. So when the client displays it using an external html2text
converter, the message is decoded twice: first by the client, then by html2text
(I use w3m).

As I understand it, notmuch (while indexing this message) decodes it once and
indexes it correctly (though it also includes the HTML tags in the index). But
what if the message has no "charset" parameter in the Content-Type email header
but does contain a <meta> content-type tag that specifies a charset? Should
such a message be considered malformed, or should it be indexed by digging into
the HTML details (the <meta> content-type)?
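
To make the second option concrete, here is a rough sketch of what "digging
into the HTML details" could look like. It is written in Python with the
standard email package and is not how notmuch itself works. The helper name
guess_html_charset, the 1024-byte sniffing window, and the us-ascii fallback
are all assumptions made for illustration.

```python
import re
from email import message_from_bytes

# Hypothetical pattern: matches both <meta charset="..."> and the older
# <meta http-equiv="Content-Type" content="text/html; charset=..."> form.
META_CHARSET_RE = re.compile(
    rb"""<meta[^>]+charset\s*=\s*["']?\s*([A-Za-z0-9_.:-]+)""",
    re.IGNORECASE,
)


def guess_html_charset(raw_message: bytes, default: str = "us-ascii") -> str:
    """Return a charset for the first text/html part of a raw message.

    The MIME Content-Type header wins if it names a charset; only when it
    does not do we look at the HTML <meta> tag. The default is an assumption.
    """
    msg = message_from_bytes(raw_message)
    for part in msg.walk():
        if part.get_content_type() != "text/html":
            continue
        declared = part.get_content_charset()  # lower-cased, or None
        if declared:
            return declared
        # Undo the transfer encoding (base64, quoted-printable) to get bytes.
        payload = part.get_payload(decode=True) or b""
        # Only sniff the start of the body, where <head> normally lives.
        match = META_CHARSET_RE.search(payload[:1024])
        if match:
            return match.group(1).decode("ascii").lower()
    return default
```

With this ordering, a message that declares cp1251 in the header is decoded
once using the header value, and the <meta> tag is consulted only when the
header says nothing.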

