On Thu, 7 Jul 2011 12:37:00 +0100, Patrick Totzke <patricktotzke@googlemail.com> wrote:
> Hi!
> Something strange goes on when I use unicode literals as querystrings:
> Database().create_query(u'teststring') yields different results than
> Database().create_query('teststring').
>
> Now it should not be a problem to encode the string to whatever encoding
> is used by notmuch/xapian internally, using u'teststring'.encode('utf8')
> for example. But can I reliably expect all strings in the index to be valid UTF-8?
>
> At any rate, I think this conversion should be made from inside the bindings.
> A query should return the same results for querystrings as string- and unicode literals.
> Any thoughts?
I hate encodings; they always confuse the heck out of me. I would
prefer it if everything were always UTF-8. notmuch.h doesn't actually state
which encoding the query string should be in, and neither does
http://xapian.org/docs/queryparser.html. ojwb said it takes UTF-8, so
that's what we should be passing.
I'll send a patch as a reply shortly. Patrick, would you care to test whether
this fixes things for you?
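
For illustration only (this is not the patch itself), here is a minimal
sketch of the kind of conversion the bindings could do before handing a
query string to the library. The helper name to_utf8_bytes is hypothetical
and not part of the notmuch API, and it assumes that byte strings passed in
by the caller are already UTF-8:

```python
# Hypothetical helper, not part of the notmuch bindings.
# type(u'') is `unicode` on Python 2 and `str` on Python 3.
text_type = type(u'')


def to_utf8_bytes(querystring):
    """Return querystring as UTF-8 encoded bytes.

    Text (unicode) input is encoded to UTF-8. Byte strings are passed
    through unchanged, on the assumption that they are already UTF-8.
    """
    if isinstance(querystring, text_type):
        return querystring.encode('utf-8')
    return querystring


# Both spellings of the same query end up as identical bytes.
same = to_utf8_bytes(u'teststring') == to_utf8_bytes(b'teststring')
```

With something like this applied inside create_query, the unicode and plain
string forms of a query would reach the library as the same byte sequence.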
Sebastian