Re: tell me how to do this right (mail sent to lists)

Subject: Re: tell me how to do this right (mail sent to lists)

Date: Fri, 12 Oct 2018 11:02:28 -0700

To: Jeff Templon, David Bremner, Daniel Kahn Gillmor

Cc: notmuch@notmuchmail.org

From: Carl Worth


On Wed, Oct 10 2018, Jeff Templon wrote:
>> The tag is not associated with the file in Sent, it is associated
>> with the message-id.
>
> I guess I didn't make myself clear enough, again.  I didn't mean that
> the tag is associated with the file.  What I am guessing is something
> like this:

Hi Jeff,

Thanks for persevering so that we can all try to understand what's
happening. I appreciate the patience on all sides. :-)

> for message in new_messages:
>    if message.id not in database:
>       process message and determine list of tags
>       apply those tags to the messageID

Well, there are actually a couple of different processing loops that you
might be describing with the above. Let me try to walk through things:

First, there's a loop where "notmuch new" finds previously-unseen files
and indexes the content, adding it to the database:

for message_file in new_files:
  message_id = get_headers_message_id (message_file)
  (message,is_new) = database_lookup (message_id)
  index_file (message, message_file)
  if is_new:
    add_new_message_tags (message)

The above pseudo-code is based on the loop in add_files() and add_file()
in notmuch-new.c, as well as notmuch_database_index_file() in
lib/add-message.cc, and it more or less tries to use naming consistent
with the code.

Something to note in the above loop: The database_lookup above, (which is
actually _notmuch_message_create_for_message_id), can either create a
new message object in the database or return an existing object. But,
either way, the content of the message will be indexed. So, the
significant feature is that notmuch will always be able to search the
content it indexes for any message file, (regardless of the order it was
encountered given any duplication).

However, as can also be seen in the above loop, tags that are added
to new messages, (these are as configured in the "new.tags" entry in
~/.notmuch-config) will only be added if the message is new in this pass
of notmuch-new.
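
To make that concrete, here is a small runnable sketch of the same loop.
It is a toy in-memory model, not the real notmuch code: the Message
class, index_pass(), the dict-based "database", and the (path,
message_id) input format are all made up for illustration. It shows a
Sent copy arriving in one pass and the mailing-list copy of the same
message arriving in a later pass:

```python
# Toy in-memory model of the indexing loop; not the real notmuch code.
from dataclasses import dataclass, field


@dataclass
class Message:
    message_id: str
    files: list = field(default_factory=list)
    tags: set = field(default_factory=set)


def index_pass(database, new_files, new_tags=("new",)):
    """Run one simulated "notmuch new" pass.

    database: dict mapping message-id -> Message
    new_files: iterable of (path, message_id) pairs (hypothetical format)
    Returns the message-ids that were created during this pass.
    """
    created = []
    for path, message_id in new_files:
        is_new = message_id not in database
        message = database.setdefault(message_id, Message(message_id))
        # The file is always indexed, whether or not the message is new.
        message.files.append(path)
        if is_new:
            # Initial tags are applied only on first sight of the message-id.
            message.tags.update(new_tags)
            created.append(message_id)
    return created


db = {}
# Pass 1: only the copy saved in Sent exists.
first = index_pass(db, [("Sent/cur/1", "<abc@example.org>")])
# Some tagging tool processes the message and removes "new".
db["<abc@example.org>"].tags.discard("new")
# Pass 2: the mailing-list copy of the same message arrives.
second = index_pass(db, [("INBOX/new/2", "<abc@example.org>")])

print(first)                          # message-id created in pass 1
print(second)                         # nothing created in pass 2
print(db["<abc@example.org>"].files)  # both files are indexed
print(db["<abc@example.org>"].tags)   # "new" was not re-applied
```

After the second pass the message has both files attached, but nothing
marks it as "new" again, so a tool that only looks at messages tagged
"new" never gets a reason to look at the second file.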

I'm not looking into the code for afew right now, but I can guess a
couple of places the undesired behavior could be coming from:

1. It could be looping over all messages with the "new" tag. And if your
   sent message gets tagged "new" in a pass before the mailing-list
   duplicate is present, then afew will not have access to the
   mailing-list version when it does its processing. Then, later, when a
   pass does have the mailing-list duplicate present, it won't be
   considered a "new" message so would not get picked up and processed in a
   loop considering messages tagged "new".

2. It's possible that both message files are present at the time that
   afew does its processing, but that it only opens one of the files to
   go looking for the List-Id header, (which it must be doing
   somewhere---as David mentioned, the List-Id header is not ever
   indexed by notmuch itself).

And in the above discussion, I'm assuming that it's even notmuch-new
that's doing the detection of new files. Some people use mail flows that
have some external mechanism for processing new incoming mail and then
calling "notmuch insert" for each one.

In conclusion, you have a few different options to get reliable
behavior:

One option is to use notmuch-based searches to find the mailing-list
mail that is of interest for you. To do this you would want to key off
of a header that is indexed by notmuch. For example, you could do
something like:

	notmuch tag +my-list-tag to:my-list-recipient-address

Another option is to continue to tag messages by inspecting the file
(outside of notmuch) to look for a header like List-Id (like you are
apparently doing now). To do this reliably, you would simply want to
ensure that that processing happens on every new file that is added. And
note that the "new" tag as added by "notmuch new" is not reliable for
that. That tag _is_ reliable for learning that a new message ID has
become available in the database, but is not reliable for knowing that a
new message file has appeared, (for a message ID that was present
previously).
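
As a rough sketch of that second approach (an illustration only, not
what afew does), the idea is to look at every file belonging to a
message rather than just one of them. The function names below are
hypothetical. files_for_message_id() assumes the notmuch command-line
tool is installed and that the message ID needs no special quoting in
the query; list_ids_for_files() uses only the Python standard library:

```python
import subprocess
from email import policy
from email.parser import BytesHeaderParser


def files_for_message_id(message_id):
    """Ask notmuch for every file recorded for one message ID.

    message_id is given without the surrounding angle brackets.
    Assumes the notmuch CLI is on PATH and the ID needs no extra quoting.
    """
    result = subprocess.run(
        ["notmuch", "search", "--output=files", "id:" + message_id],
        check=True,
        capture_output=True,
        text=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def list_ids_for_files(paths):
    """Return the set of List-Id header values found in any of the files."""
    parser = BytesHeaderParser(policy=policy.default)
    found = set()
    for path in paths:
        with open(path, "rb") as handle:
            # Only the header block is parsed; the body is not needed.
            headers = parser.parse(handle)
        for value in headers.get_all("List-Id", []):
            found.add(str(value).strip())
    return found
```

With something like this, a message whose Sent copy has no List-Id
would still yield the header once the mailing-list copy exists,
provided the check is re-run whenever a new file shows up and not only
when a new message ID does.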

Does that help explain things?

-Carl




