Subject: Re: finding file by size

Date: Wed, 07 Nov 2018 13:14:37 -0800

To: Ralph Seichter,


From: Carl Worth

On Tue, Nov 06 2018, Ralph Seichter wrote:
> I'm not sure about using Notmuch itself,

Right. Notmuch doesn't currently index (as far as I'm aware) anything
that would be useful for sorting by size.

> but this should work:
>   find /path/to/maildir -type f -size +50M | xargs rm

Hmm... I imagine that Mark would be more interested in viewing these
files to ensure they are what he thinks they are before deleting them.
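
As a rough sketch of that review step, the same find expression can print
a long listing instead of piping to rm. The helper name list_large_files
and its two arguments are made up for illustration, and "ls -h" is widely
supported but not required by POSIX:

```shell
# List (rather than delete) regular files above a size threshold.
# list_large_files is a hypothetical helper, not part of notmuch.
list_large_files() {
  local dir=$1            # e.g. /path/to/maildir
  local size=${2:-+50M}   # any expression accepted by find's -size
  # "-exec ... {} +" passes file names directly, so spaces are safe.
  find "$dir" -type f -size "$size" -exec ls -lh {} +
}
```

Something like "list_large_files /path/to/maildir +50M" would then show
the candidates and their sizes without touching them.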

So, capturing the results of that with a notmuch tag would be
a reasonable thing to do. The only trick there is that I don't see any
existing search term to find the message associated with a particular
file name. (We have "path:" and "folder:" to find messages in a specific
directory, but nothing I see for finding the message corresponding to a
specific file.)

So, then we could extract the message-id from each file and do a search
based on that I guess?

Here's a (bash) command I just ran on my mail store of over a million
messages that tagged the 8 messages larger than 50MB. It took about a
minute to run (with a warm cache):

  # Tag every message whose file is larger than 50MB.
  for msg in $(
               for file in $(find . -type f -size +50M); do
                 grep -i ^Message-Id "$file" | sed -e 's/^.*<\(.*\)>.*/\1/';
               done
             ); do
    notmuch tag +large id:$msg;
  done
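
A somewhat more defensive variant, offered only as a sketch: it copes
with file names containing whitespace and looks for Message-Id only in
the header, so a forwarded or attached message in the body cannot
contribute a second id. The function names message_id and
tag_large_messages are hypothetical, and "-print0", "read -d ''" and
"grep -m" assume bash with GNU or BSD tools:

```shell
#!/usr/bin/env bash
# Hypothetical helpers; neither function is part of notmuch.

# Print the Message-Id (without angle brackets) found in a file's header.
message_id() {
  # sed stops at the first blank line, so body text is never searched.
  sed '/^$/q' "$1" | grep -i -m 1 '^Message-Id:' |
    sed -n -e 's/^.*<\(.*\)>.*/\1/p'
}

# Tag every message under directory $1 whose file matches find's -size $2.
tag_large_messages() {
  local dir=$1 size=${2:-+50M} file id
  find "$dir" -type f -size "$size" -print0 |
    while IFS= read -r -d '' file; do
      id=$(message_id "$file")
      # Skip files where no Message-Id could be extracted.
      if [ -n "$id" ]; then
        notmuch tag +large -- "id:$id"
      fi
    done
}
```

Running "tag_large_messages . +50M" from the top of the mail store
should tag the same messages as the loop above. I have not measured
whether it is any faster or slower.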

With that, I'm able to go through the list from:

  notmuch search tag:large

to investigate whether these large emails are worth keeping.

So, that's obviously not extremely elegant, but it's at least possible.


