Austin Clements <aclements@csail.mit.edu> writes:

> On Mon, 03 Dec 2012, david@tethera.net wrote:
>> From: David Bremner <bremner@debian.org>
>>
>> It's a bit annoying to call tar twice, but we cache the results so it
>> isn't as bad as it could be.
>
> Why not --strip-components=1 and unpack both mail/ and tags/ into a
> single, shared corpus cache directory in one call to tar? Since you're
> going to cp -lr things anyway, you can structure the corpus cache
> however is convenient.

It's a good suggestion. The only downside is duplicating the tags. I
suppose on the scale of things that isn't a very big waste of space; the
tag corpus is currently 300k, which is dwarfed by even the "small"
corpus.

d
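
For reference, the single-call approach suggested above might look roughly like the sketch below. It assumes GNU tar and GNU cp. The tarball name, the top-level "corpus/" directory, and the "cache" and "run" directories are all hypothetical stand-ins; the real corpus layout may differ.

```shell
#!/bin/sh
set -eu

# Hypothetical layout: the real corpus tarball and directory names may differ.
work=$(mktemp -d)
cd "$work"

# Build a tiny stand-in tarball whose top-level directory holds mail/ and tags/.
mkdir -p corpus/mail corpus/tags
echo "Subject: test" > corpus/mail/msg1
echo "+inbox -- id:msg1" > corpus/tags/dump
tar -cf corpus.tar corpus
rm -r corpus

# One tar call: drop the leading "corpus/" path component and unpack both
# subtrees into a single shared cache directory.
mkdir cache
tar -xf corpus.tar -C cache --strip-components=1 corpus/mail corpus/tags

# Hard-link the cached tree into a per-run directory instead of copying it.
cp -lr cache run

ls run/mail run/tags
```

Because the per-run tree is hard-linked, the duplicated tags cost directory entries rather than a second copy of the data, as long as the cache and the run directory sit on the same filesystem.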