Re: searching for a message by path

Subject: Re: searching for a message by path

Date: Sat, 21 Sep 2024 11:38:18 +0200

To: frederik@ofb.net

Cc: Pengji Zhang, notmuch@notmuchmail.org

From: Michael J Gruber


On Sat, 21 Sep 2024 at 05:23, Frederick Eaton <frederik@ofb.net> wrote:
>
> Thank you for your response, Pengji.
>
> On Sat, Sep 21, 2024 at 08:25:10AM +0800, Pengji Zhang wrote:
> >Hi Frederick,
> >
> >Frederick Eaton <frederik@ofb.net> writes:
> >
> >>I am trying to figure out how to adapt a script I wrote for
> >>filtering messages, to apply notmuch tags to each message. A
> >>difficulty is that the messages are already in the Notmuch database,
> >>because another tool has delivered them to a maildir and run
> >>"notmuch new".
> >>
> >>Now, Notmuch can provide me with the paths of all the new
> >>(unfiltered) messages, which I can give to my script. The question I
> >>have is, once the filter is done, how can the script tell Notmuch
> >>which message to apply the tags to?
> >
> >
> >I am not sure if I understand you correctly. If the problem here is to
> >distinguish existing messages and new messages, would the config
> >option 'new.tags' work? For example, use
> >
> >   notmuch config set new.tags new
> >
> >to give all new messages a 'new' tag.
>
> No, I already have that configuration. The first sentence described what I already know how to do, the second sentence is what I'm trying to do.

It seems that we're still guessing what your script is doing or
trying to do. Do you mind sharing a trimmed-down version?

> It might be useful for the reasons I stated, namely in case the Message-ID does not exist or is not unique.

This is probably at the heart of the problem. Within notmuch, a
"message" is something identified by a message-id (mid), and all
information in the notmuch database is tied to a mid.

When you speak about a message, you probably mean the content of an
individual "message file" - which is a natural, but different notion.
A "path:" refers to a message file, a "mid:" to a message id.

When "notmuch new" encounters a new message file, it
- checks if it contains a valid "Message-ID" header
- uses that as the mid, or generates a mid using a sha1 checksum of the message file
- checks whether that mid (!) is in the database already
- adds the path to the existing db entry, or creates a new db entry

So, you may have several files (path entries) for the same mid, and
which one is used for indexing purposes depends on the order of
arrival (or, in the case of reindexing, probably on file system
ordering). notmuch assumes that this makes no difference: same mid,
same "message". This assumption can break, for example with list
copies or with headers that differ between sent and received versions.
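
To make the mid-versus-file distinction concrete, here is a minimal
Python sketch of the lookup described above. It is not notmuch's actual
code: the function name `guess_mid` is made up for illustration, the
header parsing is far simpler than what notmuch does, and the
"notmuch-sha1-" prefix for generated ids is an assumption about the
format.

```python
import hashlib
from email import message_from_bytes


def guess_mid(raw: bytes) -> str:
    """Approximate the mid that a message file would be filed under.

    Hypothetical helper: use the Message-ID header if there is one,
    otherwise fall back to a sha1 checksum of the whole file.
    """
    header = str(message_from_bytes(raw).get("Message-ID", ""))
    mid = header.strip().strip("<>").strip()
    if mid:
        return mid
    # Assumed format of the generated fallback id.
    return "notmuch-sha1-" + hashlib.sha1(raw).hexdigest()


# Two different files (say, a list copy and a direct copy) that share
# a Message-ID end up under the same mid, i.e. the same "message".
list_copy = b"Message-ID: <abc@example.org>\nList-Id: <some.list>\n\nhi\n"
direct_copy = b"Message-ID: <abc@example.org>\n\nhi\n"
same_message = guess_mid(list_copy) == guess_mid(direct_copy)
```

Here `same_message` is true even though the two files differ, which is
exactly the case where tagging "a file" and tagging "a message" stop
being the same thing.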

I'm elaborating on this because we have to guess about your script -
what is a "new message" for your script, and which kind of information
does it want to process?

Typical processing would be done in a notmuch post-hook, and it would:
- check for new messages (tag:new)
- get their file paths from `notmuch search --output=files mid:XYZ` or such
- do whatever it needs using the file if you really need to parse that yourself

I guess most of us have some sort of script running on new messages as
part of a hook, be it `afew` or something homegrown, and this
typically clears the new tag afterwards.
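
As a rough illustration of such a hook, here is a sketch in Python that
drives the notmuch command line. The names `process_new` and `classify`
are hypothetical, and `classify` stands in for whatever your filter
does with a file; only `notmuch search --output=messages`,
`notmuch search --output=files` and `notmuch tag` are real commands.
It assumes message ids that need no extra quoting in a query.

```python
import subprocess


def notmuch(*args):
    """Run a notmuch command and return its output lines."""
    result = subprocess.run(
        ["notmuch", *args], check=True, capture_output=True, text=True
    )
    return result.stdout.splitlines()


def classify(path):
    """Hypothetical filter: return the tags to add for one message file."""
    return ["filtered"]


def process_new(run=notmuch, classify=classify):
    """Tag every message carrying tag:new, then clear the new tag."""
    # Each output line is already a query term of the form "id:...".
    for query in run("search", "--output=messages", "tag:new"):
        tags = set()
        # One mid can have several files; look at all of them.
        for path in run("search", "--output=files", query):
            tags.update(classify(path))
        changes = ["+" + tag for tag in sorted(tags)] + ["-new"]
        # Tags are applied to the mid, not to an individual file.
        run("tag", *changes, "--", query)
```

Calling `process_new()` from a post-new hook would run this after each
"notmuch new". The point of the design is that the script keeps the
`id:` term it got from notmuch and tags by that, so it never has to
map a path back to a message.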

Michael
_______________________________________________
notmuch mailing list -- notmuch@notmuchmail.org
To unsubscribe send an email to notmuch-leave@notmuchmail.org
