Re: Possible threading issues in nm 0.32

Date: Tue, 11 May 2021 17:41:25 +0200

To: Alexander Adolf, David Bremner, notmuch@notmuchmail.org

From: Michael J Gruber


Alexander Adolf venit, vidit, dixit 2021-05-11 16:32:22:
> Alexander Adolf <alexander.adolf@condition-alpha.com> writes:
> 
> > Michael J Gruber <git@grubix.eu> writes:
> >
> >> [...]
> >> So it seems:
> >>
> >> - The mis-threading happens during `notmuch new`, not with `notmuch
> >>   reindex`.
> >> - In this new case (and if I remember correctly also in the others),
> >>   it's always a new message getting wrongly put into an existing thread,
> >>   and if I'm not mistaken, the existing thread was tagged as trash
> >>   before in all cases.
> >> [...]
> >
> > I can confirm both observations.
> > [...]
> 
> p.s.: Just got the weird threading with `notmuch reindex`, too.

Oh my gosh... This is getting interesting.

I'm delving (literally) into the xapian db now. "Put into an existing
thread" (what I had written) was not correct in terms of thread IDs.

What's happening is the following:

I have an existing message A which is tagged as trash.
A is the only message in thread 0000000000021144.

A new message B is put in the db by "notmuch new".
Notmuch correctly creates a new thread 0000000000021148 (the next
available id) and puts B in this new thread.
G0000000000021148 is the only thread term in the db for the document
belonging to message B.

So far so good, but: The document for message A has three thread terms
now:
G0000000000021144 G0000000000021148 G0000000000021149

Note that neither A nor B has an in-reply-to or references header.

AFAIK multiple thread terms on a single message document are a complete
no-go and indicate a problem, especially when an unrelated existing message's
entry is touched.

notmuch search --exclude=false thread:0000000000021148 lists both A and
B now, of course.
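
To make the situation above concrete, here is a minimal sketch in Python.
It does not touch a real xapian db: the term lists are typed in by hand
from the values quoted in this mail, and the helper names
(multi_thread_docs, docs_in_thread) are made up for illustration. It
assumes that thread terms are exactly the terms starting with "G".

```python
THREAD_PREFIX = "G"  # assumption: thread terms look like G0000000000021144


def thread_terms(terms):
    """Return the thread terms (prefix 'G') among one document's terms."""
    return [t for t in terms if t.startswith(THREAD_PREFIX)]


def multi_thread_docs(docs):
    """Map each document that carries more than one thread term to those terms."""
    result = {}
    for name, terms in docs.items():
        found = thread_terms(terms)
        if len(found) > 1:
            result[name] = found
    return result


def docs_in_thread(docs, thread_id):
    """List the documents that carry the thread term for thread_id."""
    term = THREAD_PREFIX + thread_id
    return sorted(name for name, terms in docs.items() if term in terms)


# Hand-copied model of the two documents described above (not read from a db).
docs = {
    "A": ["Ktrash", "G0000000000021144", "G0000000000021148", "G0000000000021149"],
    "B": ["G0000000000021148"],
}

print(multi_thread_docs(docs))                   # only A violates the invariant
print(docs_in_thread(docs, "0000000000021148"))  # both A and B match
```

Since a thread lookup only asks "which documents carry this G term", the
extra term on A is enough to pull A into B's thread.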

The third one, G0000000000021149, is completely weird. It leads to yet
another message document with multiple thread entries.

Looking at a few of the most recent messages, my suspicion is:
- A new message with in-reply-to/references gets a single (existing)
  thread term correctly.
- A new message without in-reply-to/references gets the correct new
  thread term; in addition, this term gets assigned to some random existing
  message by *adding* it to the list of terms, thereby making that
  message part of multiple threads.

I have not checked systematically yet whether the multiple G terms
indeed affect only documents that carry Ktrash.
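
The systematic check could look roughly like the sketch below, once the
term list of every message document has been dumped from the db by
whatever means. The function names are hypothetical, the input is a plain
dict of hand-written term lists rather than a real xapian db, and
document "C" with its thread ids is invented purely to exercise the
"not trashed" branch.

```python
def has_multiple_thread_terms(terms):
    """True if a document carries more than one thread term (prefix 'G')."""
    return sum(1 for t in terms if t.startswith("G")) > 1


def split_affected_by_trash(docs):
    """Split documents with multiple thread terms by whether they carry Ktrash."""
    affected = [name for name, terms in docs.items()
                if has_multiple_thread_terms(terms)]
    trashed = sorted(n for n in affected if "Ktrash" in docs[n])
    untrashed = sorted(n for n in affected if "Ktrash" not in docs[n])
    return trashed, untrashed


docs = {
    # A and B as described in this mail
    "A": ["Ktrash", "G0000000000021144", "G0000000000021148", "G0000000000021149"],
    "B": ["G0000000000021148"],
    # C is hypothetical, invented only to show the second result list
    "C": ["Kinbox", "G0000000000000001", "G0000000000000002"],
}

trashed, untrashed = split_affected_by_trash(docs)
print("multiple G terms, tagged trash:", trashed)
print("multiple G terms, not tagged trash:", untrashed)
```

An empty second list on real data would support the suspicion that only
Ktrash documents are affected; a single entry there would refute it.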

Michael