[PATCH] lib: add support for thread:<message-id> queries

Subject: [PATCH] lib: add support for thread:<message-id> queries

Date: Mon, 16 Oct 2017 22:07:54 +0300

To: notmuch@notmuchmail.org

Cc:

From: Jani Nikula


Add support for querying threads using message-ids in addition to
thread-ids. The main benefit is that thread queries via message-ids
are portable across databases, re-indexing, and thread joining, while
thread ids can be somewhat transient. A thread:<message-id> query can
be shared between notmuch users, while a thread:<thread-id> query may
cease to work after the regular delivery of a message that joins
threads.

What previously required:

$ notmuch search $(notmuch search --output=threads id:<message-id>)

can now be reduced to:

$ notmuch search thread:<message-id>

We limit the query to message-ids that have @ to optimize regular
thread-id queries and to avoid collisions with thread-ids which are
guaranteed (or which we can guarantee) to not contain @. This seems
reasonable, as valid message-ids will have @.

The performance penalty for regular thread-id queries seems to be
neglible, and thread queries via message-ids seem to be about 10%
slower than the direct thread-id query counterpart.
---
 doc/man7/notmuch-search-terms.rst |  9 ++++---
 lib/Makefile.local                |  1 +
 lib/database.cc                   |  6 ++++-
 lib/thread-fp.cc                  | 53 +++++++++++++++++++++++++++++++++++++++
 lib/thread-fp.h                   | 39 ++++++++++++++++++++++++++++
 5 files changed, 104 insertions(+), 4 deletions(-)
 create mode 100644 lib/thread-fp.cc
 create mode 100644 lib/thread-fp.h

diff --git a/doc/man7/notmuch-search-terms.rst b/doc/man7/notmuch-search-terms.rst
index b27f31f7545c..f3723ceb959b 100644
--- a/doc/man7/notmuch-search-terms.rst
+++ b/doc/man7/notmuch-search-terms.rst
@@ -52,7 +52,7 @@ indicate user-supplied values):
 
 -  id:<message-id>
 
--  thread:<thread-id>
+-  thread:<message-id> or thread:<thread-id>
 
 -  folder:<maildir-folder>
 
@@ -101,10 +101,13 @@ For **id:**, message ID values are the literal contents of the
 Message-ID: header of email messages, but without the '<', '>'
 delimiters.
 
-The **thread:** prefix can be used with the thread ID values that are
+The **thread:** prefix can be used with either the message ID of any
+of the messages in the thread or the thread ID values that are
 generated internally by notmuch (and do not appear in email messages).
 These thread ID values can be seen in the first column of output from
-**notmuch search**
+**notmuch search**. The thread ID values are transient. Thread queries
+by message ID are only available if notmuch is built with **Xapian
+Field Processors** (see below).
 
 The **path:** prefix searches for email messages that are in
 particular directories within the mail store. The directory must be
diff --git a/lib/Makefile.local b/lib/Makefile.local
index 8aa03891d775..123ef04c64a9 100644
--- a/lib/Makefile.local
+++ b/lib/Makefile.local
@@ -58,6 +58,7 @@ libnotmuch_cxx_srcs =		\
 	$(dir)/query-fp.cc      \
 	$(dir)/config.cc	\
 	$(dir)/regexp-fields.cc	\
+	$(dir)/thread-fp.cc	\
 	$(dir)/thread.cc
 
 libnotmuch_modules := $(libnotmuch_c_srcs:.c=.o) $(libnotmuch_cxx_srcs:.cc=.o)
diff --git a/lib/database.cc b/lib/database.cc
index 35c66939bdcf..82d2dcf22aa0 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -21,6 +21,7 @@
 #include "database-private.h"
 #include "parse-time-vrp.h"
 #include "query-fp.h"
+#include "thread-fp.h"
 #include "regexp-fields.h"
 #include "string-util.h"
 
@@ -258,7 +259,8 @@ prefix_t prefix_table[] = {
     { "directory",		"XDIRECTORY",	NOTMUCH_FIELD_NO_FLAGS },
     { "file-direntry",		"XFDIRENTRY",	NOTMUCH_FIELD_NO_FLAGS },
     { "directory-direntry",	"XDDIRENTRY",	NOTMUCH_FIELD_NO_FLAGS },
-    { "thread",			"G",		NOTMUCH_FIELD_EXTERNAL },
+    { "thread",			"G",		NOTMUCH_FIELD_EXTERNAL |
+						NOTMUCH_FIELD_PROCESSOR },
     { "tag",			"K",		NOTMUCH_FIELD_EXTERNAL |
 						NOTMUCH_FIELD_PROCESSOR },
     { "is",			"K",		NOTMUCH_FIELD_EXTERNAL |
@@ -317,6 +319,8 @@ _setup_query_field (const prefix_t *prefix, notmuch_database_t *notmuch)
 	    fp = (new DateFieldProcessor())->release ();
 	else if (STRNCMP_LITERAL(prefix->name, "query") == 0)
 	    fp = (new QueryFieldProcessor (*notmuch->query_parser, notmuch))->release ();
+	else if (STRNCMP_LITERAL(prefix->name, "thread") == 0)
+	    fp = (new ThreadFieldProcessor (notmuch))->release ();
 	else
 	    fp = (new RegexpFieldProcessor (prefix->name, prefix->flags,
 					    *notmuch->query_parser, notmuch))->release ();
diff --git a/lib/thread-fp.cc b/lib/thread-fp.cc
new file mode 100644
index 000000000000..f15aabba18c0
--- /dev/null
+++ b/lib/thread-fp.cc
@@ -0,0 +1,53 @@
+/* thread-fp.cc - "thread:" field processor glue
+ *
+ * This file is part of notmuch.
+ *
+ * Copyright © 2017 Jani Nikula
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see https://www.gnu.org/licenses/ .
+ */
+
+#include "database-private.h"
+#include "thread-fp.h"
+
+#if HAVE_XAPIAN_FIELD_PROCESSOR
+Xapian::Query
+ThreadFieldProcessor::operator() (const std::string & thread)
+{
+    std::string term_prefix = _find_prefix ("thread");
+    std::string thread_id;
+    notmuch_message_t *message = NULL;
+
+    /*
+     * Only support thread queries via message-ids that have @. The
+     * reason for this is twofold: First, it's an optimization for
+     * regular thread-id queries. Second, and more importantly, it
+     * prevents (potentially malicious) message-id and thread-id
+     * collisions. Regular thread-ids are guaranteed to not have @ in
+     * them, and we only lose the ability to query threads by invalid
+     * message-ids that don't have @ in them.
+     */
+    if (strchr (thread.c_str (), '@'))
+	notmuch_database_find_message (notmuch, thread.c_str (), &message);
+
+    if (message) {
+	thread_id = notmuch_message_get_thread_id (message);
+	notmuch_message_destroy (message);
+    } else {
+	thread_id = thread;
+    }
+
+    return Xapian::Query (term_prefix + thread_id);
+}
+#endif
diff --git a/lib/thread-fp.h b/lib/thread-fp.h
new file mode 100644
index 000000000000..3f88a5831b6b
--- /dev/null
+++ b/lib/thread-fp.h
@@ -0,0 +1,39 @@
+/* thread-fp.h - thread field processor glue
+ *
+ * This file is part of notmuch.
+ *
+ * Copyright © 2017 Jani Nikula
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see https://www.gnu.org/licenses/ .
+ */
+
+#ifndef NOTMUCH_THREAD_FP_H
+#define NOTMUCH_THREAD_FP_H
+
+#include <xapian.h>
+#include "notmuch.h"
+
+#if HAVE_XAPIAN_FIELD_PROCESSOR
+class ThreadFieldProcessor : public Xapian::FieldProcessor {
+ protected:
+    notmuch_database_t *notmuch;
+
+ public:
+    ThreadFieldProcessor (notmuch_database_t *notmuch_)
+	: notmuch(notmuch_) { };
+
+    Xapian::Query operator()(const std::string & str);
+};
+#endif
+#endif /* NOTMUCH_THREAD_FP_H */
-- 
2.11.0

_______________________________________________
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch

Thread: