On Sat, 21 Nov 2009 23:26:15 +0100, Mike Hommey <mh+notmuch@glandium.org> wrote:
> I just was able to reproduce after starting over.

Thanks Mike. I was able to reproduce this as well by eliminating the
spurious blank line I had on the 2nd or 3rd line. (So maybe that managed
to sneak in when you sent me the message.)

> header isn't "", and message_id is correctly filled. I can also confirm
> the exception is thrown from notmuch->xapian_db->add_document.

Yes. We were trying to add a term that is too long for Xapian. I've
fixed this by simply falling back to our existing sha-1 code when a
message ID is long.

Thanks so much for the bug report!

-Carl

commit 5d56e931b99d575dbb0b936d24aae5e9903861ad
Author: Carl Worth <cworth@cworth.org>
Date:   Sun Nov 22 04:03:49 2009 +0100

    add_message: Use sha-1 in place of overly long message ID.

    Since Xapian has a limit on the maximum length of a term, we have
    to check for that before trying to add the message ID as a term.

    This fixes the bug reported by Mike Hommey here:

        <20091120132625.GA19246@glandium.org>

    I've also constructed 20 files with a range of message ID lengths
    centered around the Xapian term-length limit which I'll use to seed
    a new test suite soon.

diff --git a/lib/database.cc b/lib/database.cc
index 169dc5e..f4a445a 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -892,7 +892,7 @@ notmuch_database_add_message (notmuch_database_t *notmuch,
 
     const char *date, *header;
     const char *from, *to, *subject;
-    char *message_id;
+    char *message_id = NULL;
 
     if (message_ret)
         *message_ret = NULL;
@@ -937,11 +937,20 @@ notmuch_database_add_message (notmuch_database_t *notmuch,
         header = notmuch_message_file_get_header (message_file, "message-id");
         if (header && *header != '\0') {
             message_id = _parse_message_id (message_file, header, NULL);
+
             /* So the header value isn't RFC-compliant, but it's
              * better than no message-id at all. */
             if (message_id == NULL)
                 message_id = talloc_strdup (message_file, header);
-        } else {
+
+            /* Reject a Message ID that's too long. */
+            if (message_id && strlen (message_id) + 1 > NOTMUCH_TERM_MAX) {
+                talloc_free (message_id);
+                message_id = NULL;
+            }
+        }
+
+        if (message_id == NULL ) {
             /* No message-id at all, let's generate one by taking a
              * hash over the file's contents. */
             char *sha1 = notmuch_sha1_of_file (filename);