On Mon, Nov 13, 2017 at 09:22:36AM -0400, David Bremner wrote:
> The other thing I don't know is how many people would be happy with just
> stripping all accents. That could be done in a gmime filter, as you
> suggest. That would be more likely to require changes to the query
> language. Off hand I don't know how to transparently de-accent all query
> words.

My gut feeling is that removing accents by default from both the terms
in the index and user queries would go a long way toward addressing this
problem. Especially so if it's a boolean option in the notmuch config
(which defaults to stripping accents).

As a random example/data point, chromium does that: when you search for
an unaccented string in a web page, it finds every variant of that
string with accents. It is, by far, the best UX I've had w.r.t. accents
on GNU/Linux.

Unicode has a notion of canonical forms, one of which decomposes
accented characters into a sequence of non-accented characters +
modifiers https://en.wikipedia.org/wiki/Unicode_equivalence . A bunch of
libraries use that stuff to normalize away accents in unicode strings.
I'm aware of a few in Python for instance, but not in C++ (which I
believe is what you'd be interested in).

HTH,
-- 
Stefano Zacchiroli . zack@upsilon.cc . upsilon.cc/zack . . o . . . o . o
Computer Science Professor . CTO Software Heritage . . . . . o . . . o o
Former Debian Project Leader & OSI Board Director  . . . o o o . . . o .
« the first rule of tautology club is the first rule of tautology club »
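
P.S. To make the normalization idea above concrete, here is a minimal
sketch in Python using only the standard unicodedata module. It is not
notmuch code, and strip_accents is a name made up for this illustration.
It assumes that "stripping accents" means decomposing to NFD and
dropping the combining marks, which is one possible reading of the
approach, not necessarily what any given library does.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Hypothetical helper: remove combining marks after NFD decomposition."""
    # NFD rewrites precomposed characters as base character + combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only characters whose canonical combining class is 0 (non-marks).
    stripped = "".join(ch for ch in decomposed if unicodedata.combining(ch) == 0)
    # Recompose whatever is left so the result is in a single normal form.
    return unicodedata.normalize("NFC", stripped)


# Folding both the indexed term and the query term makes them compare equal.
indexed_term = strip_accents("caf\u00e9")  # "café" as it appears in a message
query_term = strip_accents("cafe")         # what the user typed
assert indexed_term == query_term
```

Note that this only affects characters that have a canonical
decomposition. Letters such as "ø" or "ß" have none and pass through
unchanged, so a real implementation might need extra folding rules.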