Subject: [RFC PATCH 3/3] lib: add gnulib get_date() based date range search
Date: Wed, 10 Aug 2011 10:22:07 +0000
To: notmuch@notmuchmail.org
Cc: amdragon@mit.edu
From: Jani Nikula

Add a custom value range processor to handle "date:" using get_date() from
gnulib. This enables date (and time) searches of the form
date:since..until, where "since" and "until" are expressions understood by
get_date(), compatible with most GNU programs. For the date input formats,
see the GNU coreutils manual:
http://www.gnu.org/software/coreutils/manual/html_node/Date-input-formats.html

Open-ended ranges are supported (since Xapian 1.2.1), i.e. you can specify
date:..until or date:since.. to leave the start or end date, respectively,
unbounded.

Note: The get_date() function has been renamed to parse_datetime() in
recent gnulib.

EXAMPLES:

date:2-weeks-ago..
date:today00:00:00..
date:yesterday00:00:00..today00:00:00
date:07/14/2011..2011-07-15

BUGS/CAVEATS:

At the moment it seems that notmuch does not support search phrases
containing spaces. For date: this means you can't use a date expression
that contains spaces. In many (but apparently not all) cases you can work
around this by replacing the spaces with '-' or by simply leaving out the
whitespace. For example, date:2-days-ago..yesterday00:00:00.
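
The '-' workaround above could be automated by a front end before it builds
the query string. The following is only a minimal sketch and not part of this
patch: hyphenate_date_expr() is a hypothetical name, and as noted above the
rewritten expression is not guaranteed to be accepted by get_date() in every
case.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper (not part of this patch): replace every space in a
// get_date()-style expression with '-', so that the expression survives
// notmuch's query parsing, e.g. "2 days ago" becomes "2-days-ago".
// Assumption: get_date() accepts the hyphenated form, which holds in many
// but apparently not all cases.
std::string
hyphenate_date_expr (std::string expr)
{
    std::replace (expr.begin (), expr.end (), ' ', '-');
    return expr;
}
```

A caller would then search for "date:" + hyphenate_date_expr ("2 days ago")
+ "..".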

For the purpose of searching mail, the get_date() implementation has some
surprising interpretations. For example:

* Any date specification without time, such as date:yesterday.. or
  date:2011-08-10.., means that day at the same time as now (instead of
  00:00:00).

* As a consequence, ranges such as date:yesterday..yesterday match only
  messages timestamped at exactly the current time of day, to the second.
  Ideally, the date parser would interpret an expression differently
  depending on whether it is parsing "since" or "until". For example,
  date:today..today should cover the whole day from beginning to end, i.e.
  date:today00:00:00..today23:59:59.

* date:monday.. means date:next-monday.. (rather than date:last-monday..)

* date:this-week.. really means date:now.. (you probably want
  date:1-week-ago-next-monday00:00:00.. or similar).

However, there is value in being compatible with GNU programs, and the
input formats have been rather well documented. It would be more surprising
to deviate from that, and it would also take some effort to do so,
including testing.
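
For illustration only, the "since"/"until" distinction discussed above could
be sketched as a post-processing step on the parsed timestamp. This is not
what the patch does: start_of_day() and end_of_day() are hypothetical helpers,
and the sketch assumes the caller already knows that the expression named a
date without a time of day, which get_date() does not report.

```cpp
#include <time.h>

// Hypothetical helper (not part of this patch): round a timestamp down to
// 00:00:00 local time on the same day, as a "since" bound might want.
time_t
start_of_day (time_t t)
{
    struct tm tm;
    localtime_r (&t, &tm);
    tm.tm_hour = 0;
    tm.tm_min = 0;
    tm.tm_sec = 0;
    tm.tm_isdst = -1;	/* let mktime() work out whether DST applies */
    return mktime (&tm);
}

// Hypothetical helper (not part of this patch): round a timestamp up to
// 23:59:59 local time on the same day, as an "until" bound might want.
time_t
end_of_day (time_t t)
{
    struct tm tm;
    localtime_r (&t, &tm);
    tm.tm_hour = 23;
    tm.tm_min = 59;
    tm.tm_sec = 59;
    tm.tm_isdst = -1;	/* let mktime() work out whether DST applies */
    return mktime (&tm);
}
```

With such helpers, date:today..today would cover start_of_day (now) through
end_of_day (now), at the cost of no longer matching the behaviour of other
GNU programs.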
---
 lib/Makefile.local     |    1 +
 lib/database-private.h |    1 +
 lib/database.cc        |    4 ++++
 lib/getdate-proc.cc    |   32 ++++++++++++++++++++++++++++++++
 lib/getdate-proc.h     |   21 +++++++++++++++++++++
 5 files changed, 59 insertions(+), 0 deletions(-)
 create mode 100644 lib/getdate-proc.cc
 create mode 100644 lib/getdate-proc.h

diff --git a/lib/Makefile.local b/lib/Makefile.local
index a1c234f..b722343 100644
--- a/lib/Makefile.local
+++ b/lib/Makefile.local
@@ -63,6 +63,7 @@ libnotmuch_c_srcs =		\
 
 libnotmuch_cxx_srcs =		\
 	$(dir)/database.cc	\
+	$(dir)/getdate-proc.cc	\
 	$(dir)/directory.cc	\
 	$(dir)/index.cc		\
 	$(dir)/message.cc	\
diff --git a/lib/database-private.h b/lib/database-private.h
index f705009..d83ae3b 100644
--- a/lib/database-private.h
+++ b/lib/database-private.h
@@ -51,6 +51,7 @@ struct _notmuch_database {
     Xapian::QueryParser *query_parser;
     Xapian::TermGenerator *term_gen;
     Xapian::ValueRangeProcessor *value_range_processor;
+    Xapian::ValueRangeProcessor *getdate_proc;
 };
 
 /* Return the list of terms from the given iterator matching a prefix.
diff --git a/lib/database.cc b/lib/database.cc
index 9c2f4ec..b1a0732 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -19,6 +19,7 @@
  */
 
 #include "database-private.h"
+#include "getdate-proc.h"
 
 #include <iostream>
 
@@ -667,12 +668,14 @@ notmuch_database_open (const char *path,
 	notmuch->term_gen = new Xapian::TermGenerator;
 	notmuch->term_gen->set_stemmer (Xapian::Stem ("english"));
 	notmuch->value_range_processor = new Xapian::NumberValueRangeProcessor (NOTMUCH_VALUE_TIMESTAMP);
+	notmuch->getdate_proc = new GetDateValueRangeProcessor (NOTMUCH_VALUE_TIMESTAMP, "date:", true);
 
 	notmuch->query_parser->set_default_op (Xapian::Query::OP_AND);
 	notmuch->query_parser->set_database (*notmuch->xapian_db);
 	notmuch->query_parser->set_stemmer (Xapian::Stem ("english"));
 	notmuch->query_parser->set_stemming_strategy (Xapian::QueryParser::STEM_SOME);
 	notmuch->query_parser->add_valuerangeprocessor (notmuch->value_range_processor);
+	notmuch->query_parser->add_valuerangeprocessor (notmuch->getdate_proc);
 
 	for (i = 0; i < ARRAY_SIZE (BOOLEAN_PREFIX_EXTERNAL); i++) {
 	    prefix_t *prefix = &BOOLEAN_PREFIX_EXTERNAL[i];
@@ -716,6 +719,7 @@ notmuch_database_close (notmuch_database_t *notmuch)
     delete notmuch->query_parser;
     delete notmuch->xapian_db;
     delete notmuch->value_range_processor;
+    delete notmuch->getdate_proc;
     talloc_free (notmuch);
 }
 
diff --git a/lib/getdate-proc.cc b/lib/getdate-proc.cc
new file mode 100644
index 0000000..c384e5a
--- /dev/null
+++ b/lib/getdate-proc.cc
@@ -0,0 +1,32 @@
+
+#include "database-private.h"
+#include "getdate-proc.h"
+#include "getdate.h"
+
+/* see *ValueRangeProcessor in xapian-core/api/valuerangeproc.cc */
+Xapian::valueno
+GetDateValueRangeProcessor::operator()(std::string &begin, std::string &end)
+{
+    struct timespec result, now;
+
+    if (Xapian::StringValueRangeProcessor::operator()(begin, end) == Xapian::BAD_VALUENO)
+	return Xapian::BAD_VALUENO;
+
+    clock_gettime(CLOCK_REALTIME, &now);
+
+    if (!begin.empty()) {
+	if (!get_date(&result, begin.c_str(), &now) || result.tv_sec == -1)
+	    return Xapian::BAD_VALUENO;
+
+	begin.assign(Xapian::sortable_serialise((double)result.tv_sec));
+    }
+
+    if (!end.empty()) {
+	if (!get_date(&result, end.c_str(), &now) || result.tv_sec == -1)
+	    return Xapian::BAD_VALUENO;
+
+	end.assign(Xapian::sortable_serialise((double)result.tv_sec));
+    }
+
+    return valno;
+}
diff --git a/lib/getdate-proc.h b/lib/getdate-proc.h
new file mode 100644
index 0000000..706fc0a
--- /dev/null
+++ b/lib/getdate-proc.h
@@ -0,0 +1,21 @@
+
+#ifndef NOTMUCH_GETDATE_PROC_H
+#define NOTMUCH_GETDATE_PROC_H
+
+#include <xapian.h>
+
+/* see *ValueRangeProcessor in xapian-core/include/xapian/queryparser.h */
+class GetDateValueRangeProcessor : public Xapian::StringValueRangeProcessor {
+public:
+	GetDateValueRangeProcessor(Xapian::valueno slot_)
+		: StringValueRangeProcessor(slot_) { }
+
+	GetDateValueRangeProcessor(Xapian::valueno slot_,
+				   const std::string &str_,
+				   bool prefix_ = true)
+		: StringValueRangeProcessor(slot_, str_, prefix_) { }
+
+	Xapian::valueno operator()(std::string &begin, std::string &end);
+};
+
+#endif /* NOTMUCH_GETDATE_PROC_H */
-- 
1.7.1

