notmuch command-line arguments are not <search-term>s

Subject: notmuch command-line arguments are not <search-term>s

Date: Tue, 25 Mar 2025 00:16:38 -0400

To: notmuch@notmuchmail.org

Cc:

From: Ian! D. Allen


Summary (TL;DR): The notmuch man page SYNOPSIS shows blank-separated command-line arguments labelled "<search-term>", but these are not search terms; these command-line arguments are merely pieces of arbitrary text concatenated together, separated by blanks, into one long query string that is then parsed for search terms.  Calling command-line arguments <search-term> is wrong.

Discussion:

As a Unix/Linux command-line user since 1976 (not a typo), I have expectations about how commands use their command-line arguments. The way notmuch handles search terms violates these expectations and leads to confusing and unintuitive results.

The notmuch man page SYNOPSIS illustration of a "<search-term>" is misleading:

     notmuch search [option ...] <search-term> ...

This SYNOPSIS means that each <search-term> is a separate blank-separated command-line argument, using the same SYNOPSIS syntax as every other Unix/Linux command with multiple arguments (e.g. "man ls", "man echo", etc.).

The notmuch man page claims that a <search-term> "can consist of free-form text (and  quoted  phrases) which will match" in messages, and that "Each term in the query will be implicitly connected by a logical AND".

The man page SYNOPSIS shows that separate command-line arguments are separate search terms, so that (1) a command line with a single argument (a single search term) must be different from (2) a command line with two blank-separated arguments (two search terms), which should itself be the same as if (3) the two search terms were joined by AND:

    1. $ notmuch search 'from:"ian allen"'
    2. $ notmuch search 'from:"ian' 'allen"'
    3. $ notmuch search 'from:"ian' AND 'allen"'

In fact, notmuch treats (1) and (2) as exactly the same.  The two command-line arguments are simply joined together with a blank, not with AND.  Despite what the man page says about implicit AND joining search terms, the two command-line arguments in (2) are not joined with AND to be the same as (3).

The SYNOPSIS section says each command-line argument is a <search-term>, but we can see that command line arguments are *not* search terms.

In fact, the notmuch parser doesn't treat separate arguments on the command line as separate search terms; it joins all the command-line argument strings, separated by blanks, into one long query string and only then parses that string for search terms.
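
The joining step described above can be imitated in the shell itself. This is only a sketch of the observed behaviour, not notmuch's actual code; join_args is a made-up helper name. It shows that cases (1) and (2) produce the identical query string, while (3) produces a different one:

```sh
#!/bin/sh
# join_args: stand-in for the joining step described above.
# It is an illustration only, not notmuch's actual code.
join_args() {
    # "$*" expands to all arguments joined by the first character
    # of IFS, which is a blank by default.
    printf '%s\n' "$*"
}

one=$(join_args 'from:"ian allen"')           # case (1): one argument
two=$(join_args 'from:"ian' 'allen"')         # case (2): two arguments
three=$(join_args 'from:"ian' AND 'allen"')   # case (3): explicit AND

[ "$one" = "$two" ] && echo "(1) and (2) give the same string: $one"
[ "$two" != "$three" ] && echo "(3) gives a different string: $three"
```

Once the arguments are joined like this, nothing records where one argument ended and the next began.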

For example, though the syntax below is of the form "<search-term> AND <search-term>", notmuch doesn't see it that way:

    4. $ notmuch search 'date:January' AND 'from:ian OR from:contact'
    5. $ notmuch search 'date:January' AND '( from:ian OR from:contact )'

The SYNOPSIS leads us to believe that (4) would be the same as (5), but it is not. notmuch completely throws away the concept of separate command-line arguments, concatenates all the argument strings together separated by blanks, and parses the resulting long string as one big search query. In that string the AND binds first, so (4) becomes the string in (6) and is parsed as in (7):

    6. $ notmuch search 'date:January AND from:ian OR from:contact'
    7. $ notmuch search '( date:January AND from:ian ) OR from:contact'

All of (4), (6), and (7) are the same query.  Splitting the query into what the man page would call separate search terms does nothing.  The SYNOPSIS <search-term> does *not* map onto command-line arguments.
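
Anyone who wants argument boundaries to act as grouping boundaries, as in (5), has to add the parentheses before notmuch sees the arguments. A minimal sketch of such a wrapper follows. group_args is a hypothetical helper, not part of notmuch, and it assumes every non-operator argument is a complete expression on its own (fragments such as 'from:"ian' would still produce a broken query):

```sh
#!/bin/sh
# group_args: hypothetical helper, not part of notmuch.
# Wraps every argument that is not a bare boolean operator in
# parentheses and prints the resulting single query string.
# Plain sh has no local variables, so "out" and "piece" are global.
group_args() {
    out=
    for arg in "$@"; do
        case $arg in
            AND|OR|NOT|XOR) piece=$arg ;;       # operators pass through
            *)              piece="( $arg )" ;; # everything else is grouped
        esac
        out="${out:+$out }$piece"
    done
    printf '%s\n' "$out"
}

group_args 'date:January' AND 'from:ian OR from:contact'
# prints: ( date:January ) AND ( from:ian OR from:contact )
```

The result could then be passed as one argument, e.g. notmuch search "$(group_args ...)". Whether notmuch parses the grouped string as intended should still be checked against the NOTMUCH_DEBUG_QUERY output.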

This is not intuitive and not the way the man page explains it.

Because all the command-line arguments are simply pieces of text concatenated together with blanks between, the SYNOPSIS section is wrong and the man page description of a search term as an argument is also wrong.

I offer up this revised SYNOPSIS entry:

    notmuch search [option ...] <search-text> ...

with these added lines in the DESCRIPTION:

    All the blank-separated <search-text> command-line arguments are
    concatenated together, separated by blanks, into one long query
    string before search terms are identified.  There is no requirement
    that each command-line argument be a valid or complete search term
    or expression; only the final concatenated string is used.  You may
    construct a query using separate command-line argument fragments or
    quote everything as one single argument; using separate arguments
    does nothing special.

Because all the command-line <search-text> arguments are concatenated together before parsing starts, syntax errors in early command-line arguments will affect the parsing of later command-line arguments, which is also unintuitive.  This argument contamination is made worse by the failure of notmuch to alert users about syntax errors in queries.  You can waste a lot of time trying to refine a search term in a later command line argument, when the actual problem is a silent and unreported syntax error in the first command-line argument that contaminates the whole query.

For example, these three queries look like they should all return identical results, but they do not, because of some secret, silent syntax error in the from: search term:

    $ notmuch search date:January from:'("ian allen")' | wc
       2176   28870  242461
    $ notmuch search from:'("ian allen")' date:January | wc
         62     991    8056
    $ notmuch search from:'("ian allen")' date:January date:January | wc
          1      28     237

I don't see any way to get notmuch to report these syntax errors.  You have to look at the NOTMUCH_DEBUG_QUERY output to see the mess that results:

Query string is:
from:("ian allen") date:January date:January
Exclude query is:
Query((Kdeleted OR Kspam))
Final query is:
Query(((Tmail AND (((allen@1 OR Gallen@1 OR Kallen@1 OR Kallen@1 OR Qallen@1 OR Qallen@1 OR Pallen@1 OR XPROPERTYallen@1 OR XFOLDER:allen@1 OR XFROMallen@1 OR XTOallen@1 OR XATTACHMENTallen@1 OR XMIMETYPEallen@1 OR XSUBJECTallen@1) AND ((date@2 PHRASE 4 january@3 PHRASE 4 date@4 PHRASE 4 january@5) OR (Gdate@2 PHRASE 4 Gjanuary@3 PHRASE 4 Gdate@4 PHRASE 4 Gjanuary@5) OR (Kdate@2 PHRASE 4 Kjanuary@3 PHRASE 4 Kdate@4 PHRASE 4 Kjanuary@5) OR (Kdate@2 PHRASE 4 Kjanuary@3 PHRASE 4 Kdate@4 PHRASE 4 Kjanuary@5) OR (Qdate@2 PHRASE 4 Qjanuary@3 PHRASE 4 Qdate@4 PHRASE 4 Qjanuary@5) OR (Qdate@2 PHRASE 4 Qjanuary@3 PHRASE 4 Qdate@4 PHRASE 4 Qjanuary@5) OR (Pdate@2 PHRASE 4 Pjanuary@3 PHRASE 4 Pdate@4 PHRASE 4 Pjanuary@5) OR (XPROPERTYdate@2 PHRASE 4 XPROPERTYjanuary@3 PHRASE 4 XPROPERTYdate@4 PHRASE 4 XPROPERTYjanuary@5) OR (XFOLDER:date@2 PHRASE 4 XFOLDER:january@3 PHRASE 4 XFOLDER:date@4 PHRASE 4 XFOLDER:january@5) OR (XFROMdate@2 PHRASE 4 XFROMjanuary@3 PHRASE 4 XFROMdate@4 PHRASE 4 XFROMjanuary@5) OR (XTOdate@2 PHRASE 4 XTOjanuary@3 PHRASE 4 XTOdate@4 PHRASE 4 XTOjanuary@5) OR (XATTACHMENTdate@2 PHRASE 4 XATTACHMENTjanuary@3 PHRASE 4 XATTACHMENTdate@4 PHRASE 4 XATTACHMENTjanuary@5) OR (XMIMETYPEdate@2 PHRASE 4 XMIMETYPEjanuary@3 PHRASE 4 XMIMETYPEdate@4 PHRASE 4 XMIMETYPEjanuary@5) OR (XSUBJECTdate@2 PHRASE 4 XSUBJECTjanuary@3 PHRASE 4 XSUBJECTdate@4 PHRASE 4 XSUBJECTjanuary@5))) FILTER XFROMian@1)) AND_NOT (Kdeleted OR Kspam)))
Query string is:
thread:000000000009ff3d
Exclude query is:
Query()
Final query is:
Query((Tmail AND 0 * G000000000009ff3d))
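
Output like the above can be captured on its own with a small wrapper. This is a sketch under two assumptions that should be checked against your notmuch version: that any non-empty value of NOTMUCH_DEBUG_QUERY turns the debug output on, and that notmuch writes it to standard error. debug_query is a made-up name:

```sh
#!/bin/sh
# debug_query: hypothetical wrapper, not part of notmuch.
# Assumes a non-empty NOTMUCH_DEBUG_QUERY enables the debug output
# and that the debug output is written to standard error.
debug_query() {
    # Send stderr to the caller's stdout, then discard the normal
    # search results, so only the diagnostics remain.
    NOTMUCH_DEBUG_QUERY=1 notmuch search "$@" 2>&1 >/dev/null
}

# Example (requires notmuch and a configured mail database):
#   debug_query from:'("ian allen")' date:January date:January
```

Comparing the "Final query is:" lines for two differently ordered command lines shows directly whether they were parsed as the same query.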

-- 
| Ian! D. Allen, BA-Psych, MMath-CompSci  idallen@idallen.ca Ottawa CANADA
| Home: www.idallen.com  Contact Improvisation Dance: www.contactimprov.ca
| Former college professor of Free/Libre GNU+Linux @ teaching.idallen.com
| Improve democracy www.fairvote.ca and defend digital freedom www.eff.org