Re: mass removal of duplicates

Subject: Re: mass removal of duplicates

Date: Thu, 31 Jul 2025 20:24:30 +0200

To: Tomas Hlavaty, notmuch

From: Alan Schmitt


On 2025-07-31 19:27, Tomas Hlavaty <tom@logand.com> writes:

> On Thu 31 Jul 2025 at 12:27, Alan Schmitt <alan.schmitt@polytechnique.org> wrote:
>> (which is significant as all my messages were duplicated). Is there a
>> way to do this using notmuch (as it is already identifying duplicates),
>> or do you recommend another tool (maybe a file deduplication tool)?
>
> copy all messages to a directory, where each message filename is
> sha256sum of the file contents, then remove the originals
>
> notmuch does not seem to care much about the message filenames,
> except reindex might take a long time

Unfortunately, not only does mbsync care about the file name, it also
adds a header when syncing, so the files are not fully identical (I
tried running fdupes before realizing this).
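
A minimal sketch of how such near-identical files could still be
grouped: hash each message with the sync-added header left out, then
report files that share a digest. The header name X-TUID is an
assumption (the message above does not name it), and the function names
are made up for illustration. Nothing is renamed or deleted, since
mbsync cares about the file names.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Assumption: the header added during sync is X-TUID; adjust if it differs.
IGNORED_HEADERS = (b"x-tuid:",)


def normalized_digest(raw: bytes) -> str:
    """Return the SHA-256 of a message with the ignored headers dropped."""
    digest = hashlib.sha256()
    in_headers = True
    skipping = False  # True while inside an ignored (possibly folded) header
    for line in raw.splitlines(keepends=True):
        if in_headers:
            if line.strip(b"\r\n") == b"":
                # The first blank line ends the header section.
                in_headers = False
                skipping = False
            elif line[:1] in (b" ", b"\t"):
                # Continuation of the previous header: follow its fate.
                if skipping:
                    continue
            else:
                skipping = line.lower().startswith(IGNORED_HEADERS)
                if skipping:
                    continue
        digest.update(line)
    return digest.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root whose normalized digests are equal."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[normalized_digest(path.read_bytes())].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling find_duplicates(Path("some/maildir")) on a copy of the mail
store returns lists of paths that differ only in the ignored header.
Which copy of each group to keep is left to the reader.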

Alan