Re: locales and notmuch

Subject: Re: locales and notmuch

Date: Sat, 23 Feb 2019 07:43:58 -0400

To: Matt Armstrong


From: David Bremner

Matt Armstrong writes:

> Notmuch should probably adopt a coherent strategy with respect to
> character set encodings, rather than do something ad-hoc for the
> feature.  Most systems I have worked with normalize to UTF-8 at the
> edges and do all work using that encoding.

You're probably correct. On the other hand, lack of locale handling is not
something that people actually complain about very much. So even if we do
decide to "Do the right thing", I'd probably just continue ignoring the
problem for now, rather than block work on things that do annoy people.

> It is an interesting question: what encoding does .notmuch-config use?
> UTF-8?  User's choice?

It's loaded by g_key_file_load_from_data; I suspect that does no conversion.

> Similarly, what is the encoding of notmuch's
> command line args?

There is no conversion done.

In both of these cases it probably works mostly OK for people (at least
nobody has complained) because user values are treated as opaque
NUL-terminated byte sequences.
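
To illustrate the "opaque bytes" point, here is a minimal Python sketch
(not notmuch code; the sample word and the helper name is_valid_utf8 are
made up for the example). The same text typed in an ISO-8859-1 locale
and in a UTF-8 locale arrives as different byte sequences, and only one
of them is valid UTF-8:

```python
# Illustration only: how one word looks as raw bytes in two locales.
term = "café"

latin1_bytes = term.encode("iso-8859-1")  # b'caf\xe9'
utf8_bytes = term.encode("utf-8")         # b'caf\xc3\xa9'


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

As long as the bytes are only stored and compared with other bytes from
the same locale, nothing breaks; the mismatch only shows up once they
meet data that is known to be UTF-8.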

> I was just reading and Xapian seems to store
> text in UTF-8.  If this is the case, where is the code that does the
> charset conversions between the email messages and UTF-8?

I'd have to double check the code to be sure, but I suspect this is done
by GMime when parsing the files.
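
As a rough picture of what that kind of conversion looks like, here is a
sketch using Python's standard email package as a stand-in. It is not
GMime and says nothing about how notmuch actually calls GMime; the
sample message is invented. The parser reads the declared charset from
the Content-Type header and hands back decoded text, which can then be
re-encoded as UTF-8:

```python
from email import message_from_bytes
from email.policy import default

# A made-up message whose body is ISO-8859-1, sent as raw 8-bit text.
raw_message = (
    b"From: sender@example.org\r\n"
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"caf\xe9\r\n"
)

msg = message_from_bytes(raw_message, policy=default)

# get_content() decodes the body using the charset declared in the header.
body_text = msg.get_content()

# Re-encode as UTF-8, the form an index storing UTF-8 would want.
body_utf8 = body_text.encode("utf-8")
```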

> How about
> between the command line args to UTF-8?

AFAIR, there is no conversion, and search terms are passed straight to

This probably doesn't work well for people with non-UTF-8 locales.
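
For comparison, the "normalize to UTF-8 at the edges" approach mentioned
above would amount to something like the following Python sketch. It is
hypothetical: to_utf8 is a made-up helper, not an existing notmuch
function, and it assumes the locale's preferred encoding is the right
guess for how the argument bytes were produced.

```python
import locale
from typing import Optional


def to_utf8(raw: bytes, charset: Optional[str] = None) -> bytes:
    """Re-encode bytes from the given charset (default: locale's) to UTF-8."""
    if charset is None:
        # Assumption: arguments were typed in the current locale's charset.
        charset = locale.getpreferredencoding(False)
    return raw.decode(charset).encode("utf-8")
```

With an ISO-8859-1 locale, to_utf8(b"caf\xe9", "iso-8859-1") gives
b"caf\xc3\xa9", which is the byte sequence a UTF-8 index would contain
for the same word.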