Re: Bug?: notmuch-search-show-thread shows several threads; only one containing matching messages

Date: Mon, 30 Jan 2012 00:36:33 +0100

To: notmuch

Cc:

From: Gregor Zattler


Hi Jani, notmuch developers,

Executive summary: notmuch amalgamates several e-mail threads
into one notmuch thread.  I consider this a bug:

* Jani Nikula <jani@nikula.org> [26. Jan. 2012]:
> On Thu, 26 Jan 2012 13:44:50 +0100, Gregor Zattler <telegraph@gmx.net> wrote:
>> * Jameson Graef Rollins <jrollins@finestructure.net> [25. Jan. 2012]:
>>> On Wed, 25 Jan 2012 20:19:03 -0500, Austin Clements <amdragon@MIT.EDU> wrote:
>>>> One very common cause of this is someone using "reply" to get an
>>>> initial set of recipients, but then replacing the entire message and
>>>> subject (presumably without realizing that the mail is still tracking
>>>> what it was a reply to).  This can also happen if someone
>>>> intentionally replies to multiple messages (though few mail clients
>>>> support this), or if there was a message ID collision.
>>> 
>>> This is a very common occurrence for me as well.  I would put money down
>>> that this is what you're seeing.
>> 
>> I thought about this too and this is why I checked for any
>> occurrence of Message-IDs in the other emails: 
>> 
>>    |> I isolated the thread I was interested in,
>>    |> extracted the message ids of its messages and grepped the rest of
>>    |> the messages for these message ids: no matches.[2] Therefore none of
>>    |> the remaining messages are part of the thread I was interested in
>> 
>> perhaps there was a logic error in how I did this:
>> 
>>    |> [2] grep -I "^Message-Id:" /tmp/thread-I-m-interested-in.mbox |sed -e "s/Message-Id: <//I" -e "s/>$//" >really.mid
>>    |>     grep -I -F really.mid rest.mbox
>>    |>     --> no match
>> /tmp/thread-I-m-interested-in.mbox is an mbox with the messages
>> I'm interested in, the "real" ones.  really.mid is a list of the
>> Message-IDs of these "real" emails.  rest.mbox is an mbox with the
>> other emails, which Emacs showed in its notmuch show buffer but which
>> belong to other threads.
>>
>> Since there is no match, I concluded that the threads are not linked.
>> Perhaps I made a mistake.  I'll retest it and report again.  But
>> right now I don't have the time to do this.

I redid it.  This time I used the Emacs interface, searched for
folder:orgmode date 64 bit 32
and, in the notmuch-search buffer, used notmuch-search-stash-thread-id to
get the internal thread ID.  I then ran

notmuch show --format=mbox thread:00000000000108e0 >thread.mbox

opened this mbox with mutt, saved the one thread about dates
before 1970 in one maildir
`date64bit32-I-am-interested-in.mailbox' and the rest in a
maildir `other-e-mails.mailbox'.

I produced a list of all Message-Ids of the interesting thread by
doing

rgrep -E -i "^Message-Id:[[:space:]]" date64bit32-I-am-interested-in.mailbox|egrep -o "[^<]+@[^>]+" >date64bit32-I-am-interested-in.mid

and searched for these strings in the other e-mails:

rgrep -F date64bit32-I-am-interested-in.mid other-e-mails.mailbox

No hits.

I also did it the other way around:

rgrep -E -i "^Message-Id:[[:space:]]" other-e-mails.mailbox|egrep -o "[^<]+@[^>]+" >other-e-mails.mid

rgrep -F other-e-mails.mid date64bit32-I-am-interested-in.mailbox

No hits.
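
A note on the grep invocations above, offered as a sketch under
assumptions and not as the commands that were actually run: in grep, -F
marks the pattern as a fixed string, while -f FILE reads the patterns
from FILE.  Written as "rgrep -F some.mid dir", the file name itself is
taken as the search string, so the list of IDs would need "-F -f some.mid"
to be used as patterns.  The helper names extract_mids and cross_check
below are hypothetical, GNU grep's -r is assumed in place of rgrep, and
folded Message-Id headers (ID on a continuation line) are not handled.

```shell
# Hypothetical helper: print the Message-IDs (without angle brackets) taken
# from the Message-Id header lines of all files below directory $1.
extract_mids() {
    grep -r -h -i -E '^Message-Id:[[:space:]]' "$1" |
        sed -n 's/^[^<]*<\([^>]*\)>.*$/\1/p' |
        sort -u
}

# Hypothetical helper: list the files below directory $2 that mention any
# Message-ID extracted from directory $1.  Returns 1 if there is no hit.
cross_check() {
    mids=$(mktemp)
    extract_mids "$1" > "$mids"
    # -F: every pattern is a fixed string; -f: read the patterns from a file
    if grep -r -l -F -f "$mids" "$2"; then
        status=0
    else
        status=1
    fi
    rm -f "$mids"
    return "$status"
}
```

With the two maildirs from above this would be called as
cross_check date64bit32-I-am-interested-in.mailbox other-e-mails.mailbox
and then once more with the arguments swapped.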

(I spared myself the hassle of searching for the Message-Ids in the correct
headers only; there are simply no hits anywhere in these other e-mails.)

Thus I conclude that notmuch amalgamates different e-mail-threads
into one as represented by one thread-id.

I consider this a bug.

If anybody is interested, I can email her/him the mbox file with
the relevant thread (minus privacy-relevant headers; 300 KiB gzipped).

> Do you have an mbox file in the maildir indexed by notmuch? That seems
> like the issue.

I don't think so:  I rgrepped for files with more than one line
beginning with "Message-Id".  I got 38 hits.  I looked at all of
them; they are not mbox files (at least not valid ones) but e-mails
with other e-mails attached or cited, or in one case a
multipart/mixed message with a plain text part and an HTML part.

Nonetheless I isolated all Message-Ids from these 38 files,
eliminated some HTML artefacts and grepped for them in
date64bit32-I-am-interested-in.mailbox and other-e-mails.mailbox:
No hits in either.  I also did it the other way around,
searching for the Message-Ids of the two sets in the 38 potential
mbox files: No hits.
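
The search for potential mbox files can be sketched as follows.  This is
an assumption about how such a search might be done, not the command
actually used; multi_mid_files is a hypothetical name and GNU grep's -r
is assumed.  The count is split off at the last colon because maildir
file names may themselves contain colons.

```shell
# Hypothetical helper: print the files below directory $1 that contain more
# than one line beginning with "Message-Id" (case-insensitive).
multi_mid_files() {
    # grep -c prints "path:count" for every file it reads
    grep -r -c -i '^Message-Id' "$1" |
        awk -F: '$NF > 1 { sub(/:[0-9]+$/, ""); print }'
}
```

Each file listed this way still has to be inspected by hand, since an
e-mail with another e-mail attached or cited matches just as an mbox
file would.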

Ciao, Gregor
-- 
 -... --- .-. . -.. ..--.. ...-.-
