Re: Fwd: notmuch collects empty tags/labels

Subject: Re: Fwd: notmuch collects empty tags/labels

Date: Mon, 21 Oct 2024 15:21:08 -0300

To: prowess-alarm-much@duck.com, notmuch@notmuchmail.org

Cc:

From: David Bremner


prowess-alarm-much@duck.com writes:

> Dear developers and maintainers,
>
> I'm a fan of notmuch, but recently notmuch indexed 1679 messages with
> an empty tag, when I run: notmuch count -- tag:
>
> When I look for some of the messages in this list, I get messages that
> have some tags already, and messages with no tags, but somehow,
> notmuch seems to consider the existence of a "" tag. And the outcome
> is the same, whether I search for: tag:,tag:'', or even
> tag:"". Even together with other tags.

the TL;DR:

Unfortunately, this is just Xapian's default query parser (which
notmuch extends) falling back to searching for messages that contain
the word "tag".

The longer explanation:

If you are curious, you can run

    $ NOTMUCH_DEBUG_QUERY=t notmuch count tag:""
    Query string is:
    tag:
    Exclude query is:
    Query((((Kspam OR Kdeleted) OR Kmuted) OR Kbad-address))
    Final query is:
    Query(((Tmail AND (Ztag@1 OR ZGtag@1 OR ZKtag@1 OR ZKtag@1 OR ZQtag@1 OR
    ZQtag@1 OR ZPtag@1 OR ZXPROPERTYtag@1 OR ZXFOLDER:tag@1 OR ZXFROMtag@1
    OR ZXTOtag@1 OR ZXATTACHMENTtag@1 OR ZXMIMETYPEtag@1 OR ZXSUBJECTtag@1
    OR ZXUList:tag@1)) AND_NOT (((Kspam OR Kdeleted) OR Kmuted) OR
    Kbad-address)))

It's probably not too easy to read, but if you squint you can see a
bunch of prefixes other than K, which is the prefix for tags.
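
If squinting is not appealing, a few lines of Python can pull the
prefixes out of the debug output. This is only a sketch: FINAL_QUERY is
pasted from the output above with the line breaks joined, and the helper
name stemmed_prefixes and its regex are made up for illustration, not
something notmuch or Xapian provide.

```python
import re

# Final query copied from the debug output above (line breaks joined).
FINAL_QUERY = (
    "Query(((Tmail AND (Ztag@1 OR ZGtag@1 OR ZKtag@1 OR ZKtag@1 OR ZQtag@1 OR "
    "ZQtag@1 OR ZPtag@1 OR ZXPROPERTYtag@1 OR ZXFOLDER:tag@1 OR ZXFROMtag@1 "
    "OR ZXTOtag@1 OR ZXATTACHMENTtag@1 OR ZXMIMETYPEtag@1 OR ZXSUBJECTtag@1 "
    "OR ZXUList:tag@1)) AND_NOT (((Kspam OR Kdeleted) OR Kmuted) OR "
    "Kbad-address)))"
)


def stemmed_prefixes(query, word):
    """Return the distinct prefixes under which the stemmed `word` appears.

    Hypothetical helper: it assumes stemmed terms are printed as
    Z<prefix><word>@<position>, as they are in the output above.
    """
    pattern = re.compile(r"\bZ([A-Za-z:]*?)" + re.escape(word) + r"@\d+")
    seen = []
    for prefix in pattern.findall(query):
        if prefix not in seen:  # keep first-seen order, drop repeats
            seen.append(prefix)
    return seen


prefixes = stemmed_prefixes(FINAL_QUERY, "tag")
print(prefixes)
# ['', 'G', 'K', 'Q', 'P', 'XPROPERTY', 'XFOLDER:', 'XFROM', 'XTO',
#  'XATTACHMENT', 'XMIMETYPE', 'XSUBJECT', 'XUList:']
```

The empty prefix at the front is the unprefixed body text, and K is only
one of thirteen places where the word is being looked up.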

In case of surprising results, it's also worth trying the sexp query
parser, as it is more likely to report errors, and in general handles
corner cases better (the flip side of being less DWIM, I guess).

    $ NOTMUCH_DEBUG_QUERY=t notmuch count --query=sexp '(tag "")'
    Query string is:
    (tag "")
    Exclude query is:
    Query((((Kspam OR Kdeleted) OR Kmuted) OR Kbad-address))
    Final query is:
    Query(((Tmail AND (<alldocuments> AND K)) AND_NOT (((Kspam OR
    Kdeleted) OR Kmuted) OR Kbad-address)))

This query is easier to read; in particular, K alone is the correct
translation for an empty tag. For me this gets the expected answer, 0.
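
To compare the two parsers without retyping the commands, something
like the following could work. It is a rough sketch: the function names
are made up, it only wraps the two notmuch count invocations shown
above, and it assumes notmuch is on PATH and was built with sexp query
support. Note that the query is passed as plain tag: because no shell is
involved to strip the empty quotes.

```python
import subprocess


def count_cmd(query, sexp=False):
    """Build the argv for `notmuch count`, optionally with the sexp parser."""
    cmd = ["notmuch", "count"]
    if sexp:
        cmd.append("--query=sexp")
    cmd.append(query)
    return cmd


def count(query, sexp=False):
    """Run notmuch count and return the number it prints."""
    result = subprocess.run(
        count_cmd(query, sexp), check=True, capture_output=True, text=True
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    try:
        # Default (Xapian) parser: falls back to a search for the word "tag".
        print("infix:", count("tag:"))
        # Sexp parser: looks for the bare K term, i.e. a truly empty tag.
        print("sexp: ", count('(tag "")', sexp=True))
    except (FileNotFoundError, subprocess.CalledProcessError) as err:
        print("could not run notmuch:", err)
```

If the two numbers differ, the infix query is most likely being read as
a plain word search rather than a tag search.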
_______________________________________________
notmuch mailing list -- notmuch@notmuchmail.org
To unsubscribe send an email to notmuch-leave@notmuchmail.org