Re: RFC: adding larger test corpus, switching to xz

Date: Thu, 13 Apr 2017 22:07:36 -0300

To: notmuch@notmuchmail.org

From: David Bremner


David Bremner <david@tethera.net> writes:

> I currently have some WIP code that passes all tests with our default
> corpus, but fails with the smallest performance corpus. The simplest
> thing to do would be to add a small sample from our performance corpus
> as one for our standard (correctness) suite. I'm currently looking at
> 146 LKML messages. Unpacked these are about 1.3M; they bloat the source
> tarball by about 285K, which is large in relative terms (about 40%), but
> small in absolute terms for most modern systems. If we switch to xz
> compression, the resulting tarball is only 711K.
>

In the end I found 210 messages (one thread of 100, one of 48, and
assorted smaller threads) that only bloated the source by 161k, so I
decided to add the corpus. It's not used yet in the test suite, but it
is needed by a series I will post soon.
