Re: Proof of concept for counting messages in thread

Subject: Re: Proof of concept for counting messages in thread

Date: Mon, 13 Feb 2023 21:47:24 -0400

To: Michael J Gruber

Cc: notmuch@notmuchmail.org

From: David Bremner


Michael J Gruber <michaeljgruber+grubix+git@gmail.com> writes:

> That is really weird:
> ```
> xapian-delve -t G0000000000021229 .
> Posting List for term 'G0000000000021229' (termfreq 115, collfreq 0,
> wdf_max 0): 146259 ...
> ```
> with 115 record numbers, all different.
> Doing `xapian-delve -1r` for each of them and grepping for the G-lines
> gives 115 times that correct thread id.
> Grepping for the Q-lines and notmuch-searching for the message ids
> gives only 5 results (the expected ones). Apparently, there are bogus
> mail records which that thread points to.

1) Do those "bogus" records have a "Tghost" term? That would be for
messages that are known via references, but not actually in the local
database. This is a bug / feature of the current implementation: it
counts all known messages, whether or not local copies exist.

2) Do they have more than one G term? That would suggest a bug
somewhere. We actually have a test in the test suite [1] for that, but
of course that runs against a simple artificial database.

[1]: in T670-duplicate-mid.sh:

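# Path to the Xapian database; adjust for your setup.
db=$HOME/.local/share/notmuch/default/xapian
# For every document in the database, count its G (thread-id) terms.
for doc in $(xapian-delve -1 -t '' "$db" | grep '^[1-9]'); do
    xapian-delve -1 -r "$doc" "$db" | grep -c '^G'
done > OUTPUT.raw
# Collapse to the distinct counts; a count above 1 means some document
# carries more than one thread id.
sort -u < OUTPUT.raw > OUTPUT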
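
Both questions can be checked in one pass over the posting list of the
suspect thread. What follows is only a rough sketch, not something from
the test suite: the helper name summarise_terms is made up, the
database path and thread term are the ones mentioned in this thread and
will need adjusting, and it assumes that xapian-delve -1 prints one
term per line.

```shell
# Summarise one record's term list (read from stdin) as "<G-count> <ghost|mail>".
# summarise_terms is a hypothetical helper, not part of notmuch or xapian.
summarise_terms() {
    awk '/^G/ { g++ } $0 == "Tghost" { ghost = 1 }
         END { printf "%d %s\n", g, (ghost ? "ghost" : "mail") }'
}

# Assumed values: adjust the database path and thread term to your setup.
db=${db:-$HOME/.local/share/notmuch/default/xapian}
thread=${thread:-G0000000000021229}

# Only touch the database if the tool and the path are actually there.
if command -v xapian-delve >/dev/null 2>&1 && [ -d "$db" ]; then
    for doc in $(xapian-delve -1 -t "$thread" "$db" | grep '^[1-9]'); do
        printf '%s ' "$doc"
        xapian-delve -1 -r "$doc" "$db" | summarise_terms
    done
fi
```

Each output line is a record number followed by its G count and either
"ghost" or "mail". Records reported as ghosts would account for the
inflated count in 1); any record with a G count above 1 would point at
the bug in 2).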