should filter out replies when indexing

Subject: should filter out replies when indexing

Date: Sun, 9 Mar 2025 13:41:25 +0100

To: notmuch@notmuchmail.org

Cc:

From: Martin Monperrus


Hi Notmuch team,

Here is a bug report.

Thanks,
--Martin

## Actual behavior

Notmuch indexes the full body of every message, including the quoted text of the message being replied to.

This is a problem because a search for text from one message also returns every email that quotes it in a reply.

m1 with "foobar"
-> m2 with "> foobar"
-> m3 with ">> foobar"

search("foobar") == [m1, m2, m3]

When a thread contains dozens of messages, all of them match, and one cannot tell which one to open.

## Expected behavior

Notmuch should index each message only after stripping the quoted original.

search("foobar") == [m1]
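
To illustrate the difference, here is a minimal Python sketch. It is not notmuch code and does not reflect how notmuch indexes internally; `strip_quoted`, `build_index`, and `search` are hypothetical helpers, the reply bodies and the name "Alice" are made up, and the quote detection (lines starting with ">" plus "On ... wrote:" attribution lines) is a deliberately naive assumption. Real mail would probably need one of the libraries listed in the notes.

```python
import re
from collections import defaultdict

# Assumption: quoted text is any line starting with ">" (any depth),
# plus the usual "On <date>, <name> wrote:" attribution line.
QUOTE_LINE = re.compile(r"^\s*>")
ATTRIBUTION = re.compile(r"^On .+ wrote:\s*$")


def strip_quoted(body: str) -> str:
    """Return the body without quoted lines and attribution lines."""
    kept = [
        line
        for line in body.splitlines()
        if not QUOTE_LINE.match(line) and not ATTRIBUTION.match(line)
    ]
    return "\n".join(kept)


def build_index(messages: dict, strip: bool = False) -> dict:
    """Toy inverted index: lowercase term -> set of message ids."""
    index = defaultdict(set)
    for msg_id, body in messages.items():
        text = strip_quoted(body) if strip else body
        for term in re.findall(r"\w+", text.lower()):
            index[term].add(msg_id)
    return index


def search(index: dict, term: str) -> list:
    """Return the sorted ids of messages containing the term."""
    return sorted(index.get(term.lower(), set()))


# Hypothetical thread mirroring the m1/m2/m3 example above.
messages = {
    "m1": "foobar",
    "m2": "On Sun, 9 Mar 2025, Alice wrote:\n> foobar\nThanks, looks good.",
    "m3": ">> foobar\n> Thanks, looks good.\nAgreed.",
}

actual = search(build_index(messages), "foobar")
expected = search(build_index(messages, strip=True), "foobar")
print(actual)    # every message in the thread matches
print(expected)  # only the message that originally contained the text
```

With stripping enabled, only m1 matches "foobar", while the replies stay findable through the text their authors actually wrote.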

## Notes

Several libraries exist for stripping the quoted message from a reply. For example:

Ruby: https://github.com/github/email_reply_parser
Python: https://github.com/zapier/email-reply-parser
Python: https://github.com/mailgun/talon
Python: https://github.com/lawrencepit/email_reply_parser
Python: https://github.com/alfonsrv/mailparser-reply
Python: https://github.com/closeio/quotequail/
JavaScript: https://github.com/turt2live/node-email-reply-parser
Java: https://github.com/Driftt/EmailReplyParser
PHP: https://github.com/willdurand/EmailReplyParser
Golang: https://github.com/web-ridge/email-reply-parser
