Re: Inconsistent query results

Subject: Re: Inconsistent query results

Date: Wed, 08 Mar 2017 22:32:56 -0400

To: Kirill A. Shutemov, notmuch@notmuchmail.org

Cc: xapian-discuss@lists.xapian.org

From: David Bremner


"Kirill A. Shutemov" <kirill@shutemov.name> writes:

> Hello,
>
> I found that on particular queries notmuch return different results if run
> the query few times. Re-initialing the query or db doesn't help.
>
> I've attached test case along with corpus of messages.
>
> Unpack the archive and run `make' there. It will initialize the notmuch
> database for the corpus, build and run the test-case.

Thanks for the report. I don't yet understand where the bug is, but I
think it's safe to say it's not in your code. I made a somewhat simpler
test case that displays the same problem (at the end).

I'm also fairly sure this is different from the exclude-related bug I
recently fixed in notmuch, since running your test under
"NOTMUCH_DEBUG_QUERY=yes ./test" shows the same Xapian query is used
both times.

One thing I noticed is that if I run both your test case and mine under
valgrind, I get a report of some uninitialized memory. The reports are
similar in both cases; here is part of the report from my test case:

==11180== Conditional jump or move depends on uninitialised value(s)
==11180==    at 0x5F5D0B1: OrPostList::check(unsigned int, double, bool&) (orpostlist.cc:198)
==11180==    by 0x5F4E171: check_helper (multiandpostlist.h:97)
==11180==    by 0x5F4E171: MultiAndPostList::find_next_match(double) (multiandpostlist.cc:217)
==11180==    by 0x5F44C3F: skip_to_handling_prune (branchpostlist.h:98)
==11180==    by 0x5F44C3F: AndNotPostList::advance_to_next_match(double, Xapian::PostingIterator::Internal*) (andnotpostlist.cc:50)
==11180==    by 0x5F4E317: next_helper (multiandpostlist.h:76)
==11180==    by 0x5F4E317: MultiAndPostList::next(double) (multiandpostlist.cc:238)
==11180==    by 0x5F4FACC: next_handling_prune (branchpostlist.h:85)
==11180==    by 0x5F4FACC: MultiMatch::get_mset(unsigned int, unsigned int, unsigned int, Xapian::MSet&, Xapian::Weight::Internal&, Xapian::MatchDecider const*, Xapian::KeyMaker const*) (multimatch.cc:570)
==11180==    by 0x5E485CE: Xapian::Enquire::Internal::get_mset(unsigned int, unsigned int, unsigned int, Xapian::RSet const*, Xapian::MatchDecider const*) const (omenquire.cc:581)
==11180==    by 0x5E48913: Xapian::Enquire::get_mset(unsigned int, unsigned int, unsigned int, Xapian::RSet const*, Xapian::MatchDecider const*) const (omenquire.cc:939)
==11180==    by 0x4E52FB8: _notmuch_query_count_documents (query.cc:679)
==11180==    by 0x108A99: doit (test.c:17)
==11180==    by 0x108B0C: main (test.c:28)
==11180== 
count1: 2, count2: 2

Notice that under valgrind the counts match, which strongly suggests
that whatever is going on here is related to a memory error.

In case someone on the xapian list wants to play with this, you can grab
Kirill's test corpus and driver with:

wget http://notmuchmail.org/pipermail/notmuch/attachments/20170308/fa83965a/attachment-0001.xz
tar Jxvf attachment-0001.xz

