On Fri, 09 Aug 2013, stedfast@comcast.net wrote:
> Hi guys,
>
> ( I'm the author of GMime for those that don't know)
>
> I just came across the notmuch thread (with the referenced Subject)
> but unfortunately am not subscribed to the mailing list and so am
> unable to reply to the list (hopefully no one minds me emailing them
> directly!). I wanted to reach out and offer a possible solution to the
> problem being discussed.

Thanks for your mail; hopefully you don't mind me replying to the list!

> Passing the GMIME_ENABLE_RFC2047_WORKAROUNDS flag to g_mime_init()
> *should* solve the decoding problem mentioned in the thread. This flag
> should be safe to pass into g_mime_init() without any bad side effects
> and my unit tests do test that code-path.

Many thanks, this solves my issue with the subject lines. This is the
quick patch I tried:

diff --git a/notmuch.c b/notmuch.c
index 78d29a8..7300c21 100644
--- a/notmuch.c
+++ b/notmuch.c
@@ -264,7 +264,7 @@ main (int argc, char *argv[])
 
     local = talloc_new (NULL);
 
-    g_mime_init (0);
+    g_mime_init (GMIME_ENABLE_RFC2047_WORKAROUNDS);
 #if !GLIB_CHECK_VERSION(2, 35, 1)
     g_type_init ();
 #endif

We'll need to look into using this in the lib too.

BR,
Jani.

> I took a look at gmime-filter-headers.[c,h] as well and I suspect that
> it was written back when GMime brokenly did not guarantee UTF-8
> decoded strings from functions like g_mime_message_get_subject() and
> the like. This was fixed a while back. From a quick grep of the
> ChangeLog it looks like this was probably fixed in 2.5.9 or so (but
> possibly as late as 2.6.3 as there were some other charset rfc2047
> decoder fixes around then).
>
> I know for sure that the 2.4.x series didn't guarantee UTF-8-safe
> strings, but it's been the goal of 2.6.x to make that guarantee (minus
> any bugs that may exist, but if you find any cases of that, let me
> know!)
>
> (Note: raw header values from g_mime_object_get_header() are not
> guaranteed to be UTF-8 but if you call
> g_mime_utils_header_decode_text/phrase() on them, the results are
> guaranteed to be valid UTF-8)
>
> Hope that helps,
>
> Jeff