Re: how to search for hyphenated words? (was: how to search for Morse code?)

Date: Fri, 08 Mar 2019 16:03:02 -0800

To: Gregor Zattler, notmuch@notmuchmail.org

Cc:

From: Carl Worth


Hi Gregor,

The trick here is that when notmuch indexes body text, it feeds the
text into a Xapian function that parses it by finding "terms". This
parser treats both punctuation and whitespace as separators between
terms.

So your messages are not indexed in a way that lets you distinguish
between "org notmuch" and "org-notmuch".

(Of note, the query parser applies the same parsing to your query, so
even when you think you're typing an exact phrase like "org-notmuch",
it gets parsed into the separate terms "org" and "notmuch" for
searching.)

> all these resulted in very many hits most or all of which do not
> contain the string "org-notmuch", one found email was e.g.
>
> id:20180904105723.15564-3-david@tethera.net

That message does contain the following:

   +test_emacs '(notmuch-tree "id:000-real-root@example.org")
   +           (notmuch-test-wait)

There you will notice a term "org" followed (after some punctuation
and whitespace separators) by a term "notmuch".
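
To make this concrete, here is a rough Python sketch of that kind of
tokenization. It is only a simplified model, not Xapian's actual term
generator (which has more rules than this, for instance around
underscores and some infix characters). The helper names `terms` and
`phrase_matches` are made up for illustration.

```python
import re

def terms(text):
    """Split text into lowercase terms.

    Simplifying assumption: any run of characters other than ASCII
    letters and digits counts as a separator.
    """
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]

def phrase_matches(query, body):
    """True if the query's terms appear as adjacent terms in the body."""
    q, b = terms(query), terms(body)
    return any(b[i:i + len(q)] == q for i in range(len(b) - len(q) + 1))

# The two lines quoted from the message above.
line = '''+test_emacs '(notmuch-tree "id:000-real-root@example.org")
+           (notmuch-test-wait)'''

print(terms("org-notmuch"))                 # the hyphen is lost here
print(phrase_matches("org-notmuch", line))  # "org" then "notmuch" are adjacent
```

Under this model the query "org-notmuch" and the query "org notmuch"
produce the same term list, which is why the message above counts as a
hit.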

> How would one search for hyphenated words with notmuch?

You would need to arrange to have the indexer consider the hyphen as a
letter-like character to be made part of terms. Or be extra clever and
index something like "notmuch-test-wait" in multiple ways (such as a
single term "notmuch-test-wait" as well as three adjacent terms
"notmuch", "test", and "wait" as notmuch is doing currently).
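
As a sketch of the "multiple ways" idea (purely illustrative, and not
something notmuch or Xapian does today), a tokenizer could emit each
hyphen-joined word as a single term in addition to its parts. The
function name `index_terms` is hypothetical, and a real index would
also need to record term positions so that phrase searches keep
working; this sketch ignores positions and just collects a set.

```python
import re

def index_terms(text):
    """Return the set of terms for text, indexing hyphenated words both
    as one whole term and as their individual parts."""
    found = set()
    # A "word" here is letters/digits, optionally joined by single hyphens.
    for word in re.findall(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*", text):
        word = word.lower()
        found.add(word)                # e.g. "notmuch-test-wait"
        found.update(word.split("-"))  # e.g. "notmuch", "test", "wait"
    return found

print(sorted(index_terms("(notmuch-test-wait)")))
```

With terms generated this way, a search for the single term
"org-notmuch" would no longer match text where "org" and "notmuch" are
merely neighbours separated by other punctuation.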

-Carl
