Re: 'notmuch search thread:<>' lists multiple threads

Subject: Re: 'notmuch search thread:<>' lists multiple threads

Date: Sun, 08 Apr 2018 00:04:35 -0300

To: Naveen N. Rao


From: David Bremner

"Naveen N. Rao" <> writes:

> Greetings--
> If I search for threads matching a specific thread-id, I am seeing 
> multiple results:
> $ notmuch search --output=threads thread:00000000000c4d20
> thread:00000000000c4d1e
> thread:00000000000c4d20

This looks like a bug to me. I was able to replicate it in my own mail
store with the script at the end of the message. I haven't completely
analyzed the situation yet, but one thing I noticed is that in all
"bad threads", there are files with duplicate message-ids. Typical
output looks like:

╭─ zancas:software/upstream/notmuch/test 
╰─ (git)-[master]-% notmuch search thread:000000000001760a
thread:00000000000175e5  November 03 [1/2(3)]; Bug#846042: VTK 8 (unread)
thread:000000000001760a   2016-11-27 [1/2(3)]; Bug#846042: virtual/meta package for python-vtk (unread)
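
One way to check the duplicate message-id observation for a single
thread is to ask notmuch which messages in it have more than one file
on disk. The following is only a minimal sketch, assuming notmuch is
installed and configured; the function names are made up for
illustration and are not part of the census script.

```shell
# Sketch: find messages in a thread that have more than one file.
# The function names are hypothetical; only the notmuch CLI is assumed.

# Print the message-ids matching the query in $1 (e.g. a thread: term)
# that have at least two filenames associated with them.
dup_ids_in_thread() {
    notmuch search --output=messages --duplicate=2 "$1"
}

# For each such message-id, print a header line and then every file
# that notmuch has recorded for it.
dup_files_in_thread() {
    dup_ids_in_thread "$1" | while read -r mid; do
        echo "== $mid"
        notmuch search --output=files "$mid" </dev/null
    done
}
```

For example, running dup_files_in_thread thread:000000000001760a
would list each duplicated message-id in that thread, followed by its
files.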

At least some of this mail data is public, but I'm not sure if the bad
threading is reproducible or not; I want to run a complete census
overnight before I reindex.

Even if the bug is non-deterministic, it probably lives in lib/.


count=0
success=0
for id in $(notmuch search --output=threads '*'); do
    count=$((count + 1))
    # A thread: query should match exactly one thread: itself.
    matches=$(( $(notmuch search --output=threads "$id" | wc -l) ))
    if [ "$matches" -eq 1 ]; then
        success=$((success + 1))
    else
        echo "bad thread: $id"
    fi
    # Progress indicator every 1000 threads.
    if [ $((count % 1000)) -eq 0 ]; then
        echo "$count"
    fi
done

echo "count=$count success=$success"