Re: find threads where I and Jian participated but not Dave

Subject: Re: find threads where I and Jian participated but not Dave

Date: Thu, 15 Jun 2017 18:07:53 -0700

To: David Bremner, Daniel Kahn Gillmor, Xu Wang, notmuch@notmuchmail.org

Cc:

From: Matt Armstrong


David Bremner <david@tethera.net> writes:

> Daniel Kahn Gillmor <dkg@fifthhorseman.net> writes:
>
>>
>> One of my long-standing wishes is to be able to say "show me mails in my
>> inbox from people who have replied to messages i've sent them".
>>
>> This could be re-framed as "show me threads in which i've participated,
>> where there are some messages flagged with 'inbox'".  but generating a
>> huge list of all threads in which i've participated, just to be able to
>> do an intersection operation with a (much smaller) list of all threads
>> that have a message with the inbox flag seems like a pretty gross
>> inefficiency.
>
> At the moment the best we could do is essentially the same algorithm,
> but in C instead of shell / python. Threads are not documents in the
> database, so they can't efficiently be searched for.  Of course we could
> change that, but those kind of changes take a fair amount of effort, and
> some careful design work.

Even if the C level does the same algorithm, it may be able to do some
optimizations on behalf of the "scripting layer" queries.

I suspect that a separate "thread based" query language may be an
interesting area of investigation.

Taking Daniel's last example, "show me mails in my inbox from people who
have replied to messages I've sent them".  That isn't even an entirely
unambiguous query specification.  What is *actually* desired:

a) show me messages from X that are part of threads where at least one
message is in the inbox and for which at least one message is from me.

or,

b) same as (a) but the "message from X" must be in the inbox (not just
any other message in the thread)

or,

c) same as (a) or (b) but the "message from X" is a reply (e.g. dated
after, or in-reply-to) a message from me.

or,

d) same as (c) but "message from X" is "unread", etc.

Like David's 'comm -12 A B' solution, these pretty quickly start looking
like multi-pass, or structed/nested, queries.  They are a lot more like
relational database queries (SQL) than the single-pass, flat (NoSQL)
queries we typically use with notmuch.

Thread: