Re: [PATCH 1/7] test: add known broken test for indexing html

Date: Thu, 20 Apr 2017 10:05:51 -0000

To: notmuch@notmuchmail.org

From: David Bremner


David Bremner <david@tethera.net> writes:

> 'quite' on IRC reported that notmuch new was grinding to a halt during
> initial indexing, and we eventually narrowed the problem down to some
> html parts with large embedded images. These cause the number of terms
> added to the Xapian database to explode (the first 400 messages
> generated 4.6M unique terms), and of course the resulting terms are
> not much use for searching.
>
> The second test is a sanity check for any "improved" indexing of HTML.
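
The term explosion described in the quoted text can be illustrated with a
rough sketch. This is not notmuch or Xapian code: naive_terms() is a
hypothetical stand-in for a term generator that simply splits on
non-alphanumeric characters, and the "image" is random bytes embedded as a
base64 data URI. Under those assumptions, the '+' and '/' characters in the
base64 payload break it into thousands of distinct chunks, each of which
would become a useless search term.

```python
import base64
import random
import re


def naive_terms(text):
    """Hypothetical stand-in for a term generator: split on anything
    that is not a letter or digit, lowercase, and keep unique terms."""
    return {t.lower() for t in re.split(r"[^A-Za-z0-9]+", text) if t}


# Fake image payload (assumption: random bytes approximate compressed image data).
rng = random.Random(0)
image_bytes = bytes(rng.getrandbits(8) for _ in range(200_000))
data_uri = "data:image/png;base64," + base64.b64encode(image_bytes).decode("ascii")

# The same HTML body, without and with the embedded image.
plain_html = "<html><body><p>Meeting notes for the notmuch release</p></body></html>"
image_html = plain_html.replace("</p>", '</p><img src="' + data_uri + '">')

plain_terms = naive_terms(plain_html)
image_terms = naive_terms(image_html)

print("unique terms without image:", len(plain_terms))
print("unique terms with image:   ", len(image_terms))
```

The real indexer tokenizes differently, so the exact counts here mean
nothing; the point is only that the searchable words are identical in both
cases while the term count grows with the size of the embedded data.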

pushed the first patch in the series to master.

d
