Re: UTF-8 in mail headers (namely FROM) sent by bugzilla

Subject: Re: UTF-8 in mail headers (namely FROM) sent by bugzilla

Date: Fri, 26 Jul 2013 12:16:21 +0200

To: David Bremner, Franz Fellner, notmuch@notmuchmail.org

From: Jani Nikula


On Tue, 23 Jul 2013, David Bremner <david@tethera.net> wrote:
> Franz Fellner <alpine.art.de@gmail.com> writes:
>
>>
>> OK, thx. So every app needs to get patched to display those strings
>> properly? Any chance this could be done directly in libnotmuch?  I
>> grepped for "2047" inside the "emacs" subtree, but found nothing (had
>> the hope for a comment for the workaround). Would be interesting to
>> see how this is done, so I can at least try to create a patch (though
>> my ruby is quite basic).
>
> In general notmuch relies on libgmime for rfc2047 parsing.  I'm not sure
> of all the details now, but some of the filtering does happen in the
> CLI, not the lib.  You could start by looking at
> gmime-filter-headers.[ch] in the top directory.

I'm experiencing a similar problem with the Subject: headers in bugzilla
mail. Per RFC 2047,

    Ordinary ASCII text and 'encoded-word's may appear together in the
    same header field.  However, an 'encoded-word' that appears in a
    header field defined as '*text' MUST be separated from any adjacent
    'encoded-word' or 'text' by 'linear-white-space'.

In the problematic mails, the encoded-word begins immediately after
preceding text, i.e. without linear-white-space. Manually adding that
space in the message file makes the subject display as expected.
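
To illustrate that manual fix, here is a minimal Python sketch that inserts
the missing space in front of any encoded-word that directly follows other
text. This is not something notmuch or GMime does; the function name, the
regular expression and the sample subject are my own assumptions, made up
for the example.

```python
import re

# Rough pattern for an RFC 2047 encoded-word: =?charset?encoding?text?=
# (assumption: a simplified match, not a full implementation of the grammar)
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?=")


def separate_encoded_words(value: str) -> str:
    """Insert a space before each encoded-word that follows non-whitespace."""
    out = []
    pos = 0
    for match in ENCODED_WORD.finditer(value):
        start, end = match.span()
        out.append(value[pos:start])
        # Only add whitespace when the encoded-word is glued to preceding text.
        if start > 0 and not value[start - 1].isspace():
            out.append(" ")
        out.append(match.group(0))
        pos = end
    out.append(value[pos:])
    return "".join(out)


# Hypothetical subject resembling the problematic mails.
broken = "[Bug 1]=?UTF-8?Q?caf=C3=A9?= crashes"
fixed = separate_encoded_words(broken)
```

The result can then be handed to any RFC 2047 decoder. Note that the sketch
only handles a missing space before an encoded-word, which is the case seen
in these mails, and that it changes the visible text by adding that space.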

The decoding is done in the CLI using g_mime_message_get_subject(), so
I'm not sure how much can be done about it within notmuch.

BR,
Jani.
