Re: regex search in the body

Subject: Re: regex search in the body

Date: Wed, 23 Apr 2025 19:30:07 -0000 (UTC)

To: notmuch@notmuchmail.org

Cc:

From: Olly Betts


On 2025-04-02, Michael J Gruber wrote:
> I can't even find the form `prefix*` in the search term documentation
> (just `*`). Since it's "a little bit of glob but not really" and we
> have regex searches now I would advocate for turning off `prefix*`
> because it's just confusing ("Oh, globbing works!" - "No, it
> doesn't.") and unsystematic (`*suffix` ...).

Xapian supports arbitrary use of `*` and `?` wildcards in git master
(search for "extended wildcard" in the API docs).  If you'd prefer to
use a stable release, we are steadily closing in on a new release series
which will have this feature.

However:

>> >> I would like to find all messages with the substring "identité":
>> >> - identité
>> >> - identités
>> >> - l'identité
>> >> - l’identité
>> >> - d'identité
>> >> - d’identité

Wildcards aren't really a good solution to the original problem.  A
query for `*identité*` works OK, but requires the user to know they need
to do that, and for other words it can result in false matches - for
example, if you want to match `l'été`, etc., and so search for `*été`,
you'll also match `anxiété`, `complété`, etc.
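
As a rough illustration of the false matches, here is a sketch using
Python's `fnmatch` glob matching.  This is not Xapian's wildcard code,
and the term list is made up for the example; it only shows which
strings a pattern like `*été` accepts.

```python
from fnmatch import fnmatchcase

# Hypothetical index terms, chosen only to illustrate the problem.
terms = ["été", "l'été", "anxiété", "complété", "identité"]

# A leading wildcard accepts every term that ends in "été",
# whether or not the user meant it.
matches = [t for t in terms if fnmatchcase(t, "*été")]

for term in matches:
    print(term)
```

The intended `l'été` is found, but so are the unrelated `anxiété` and
`complété`.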

The French stemmer should know to remove elisions and then all of these
will just match a query for `identité` automatically.  That's actually
already implemented, though I haven't yet merged it to the Xapian tree.
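
To give an idea of what removing elisions means, here is a minimal
sketch in Python.  It is not the actual stemmer code: the list of
elided prefixes and the `strip_elision` helper are assumptions made up
for this example, and a real stemmer does more than this (plurals, for
instance, are not handled here).

```python
import re

# Assumed set of elided prefixes (l', d', j', qu', ...), followed by
# either a straight or a typographic apostrophe.  Illustrative only.
ELISION = re.compile(r"^(?:l|d|j|m|t|s|n|c|qu)['’]", re.IGNORECASE)

def strip_elision(word: str) -> str:
    """Drop a leading elided article or pronoun, if there is one."""
    return ELISION.sub("", word)

words = ["identité", "l'identité", "l’identité", "d'identité", "d’identité"]
normalised = [strip_elision(w) for w in words]

for before, after in zip(words, normalised):
    print(before, "->", after)
```

With the elision stripped at index and query time, all of these forms
reduce to the same term, so a plain query for `identité` finds them
without any wildcard.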

Cheers,
    Olly
