"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> writes:

> Greetings--
> If I search for threads matching a specific thread-id, I am seeing
> multiple results:
>
> $ notmuch search --output=threads thread:00000000000c4d20
> thread:00000000000c4d1e
> thread:00000000000c4d20

This looks like a bug to me. I was able to replicate it in my own mail
store with the script at the end of the message. I haven't completely
analyzed the situation yet, but one thing I noticed is that in all
"bad threads", there are files with duplicate message-ids. Typical
output looks like

╭─ zancas:software/upstream/notmuch/test
╰─ (git)-[master]-% notmuch search thread:000000000001760a
thread:00000000000175e5  November 03 [1/2(3)] 128@gmx.us; Bug#846042: VTK 8 (unread)
thread:000000000001760a  2016-11-27 [1/2(3)] 128@gmx.us; Bug#846042: virtual/meta package for python-vtk (unread)

At least some of this mail data is public, but I'm not sure if the bad
threading is reproducible or not; I want to run a complete census
overnight before I reindex. Even if the bug is non-deterministic, it
probably lives in lib/add-message.cc

----------------------------------------------------------------------
count=0
success=0
for id in $(notmuch search --output=threads '*'); do
    count=$((count + 1))
    # The arithmetic expansion strips any padding that wc -l may emit.
    matches=$(( $(notmuch search --output=threads "$id" | wc -l) ))
    if [ "$matches" -eq 1 ]; then
        success=$((success + 1))
    else
        echo "bad thread: $id"
    fi
    # Progress indicator every 1000 threads.
    if [ $((count % 1000)) -eq 0 ]; then
        echo "$count"
    fi
done
echo "count=$count success=$success"
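
To follow up on the duplicate message-id observation, a rough sketch like
the one below could list which messages in a given "bad thread" are backed
by more than one file. The function name dup_ids_in_thread is made up for
illustration. It assumes that the id: terms printed by
"notmuch search --output=messages" can be passed back as queries unchanged,
and that "notmuch count --output=files" reports the number of files for the
matching message. I have not checked it against the failing threads.

```shell
# Hypothetical helper: print each message in a thread that has more
# than one file on disk (i.e. a duplicated message-id).
dup_ids_in_thread() {
    notmuch search --output=messages "$1" | while IFS= read -r mid; do
        # Redirect stdin so the inner command cannot consume the id list.
        nfiles=$(notmuch count --output=files "$mid" </dev/null)
        if [ "$nfiles" -gt 1 ]; then
            echo "$mid has $nfiles files"
        fi
    done
}

# Example invocation (thread id taken from the output above):
# dup_ids_in_thread thread:000000000001760a
```

Feeding it the ids printed as "bad thread:" by the census script should
show whether every bad thread really contains such a message.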