Re: Inline attachments

Subject: Re: Inline attachments

Date: Fri, 16 May 2025 16:27:55 +0200

To: David Bremner

Cc: notmuch@notmuchmail.org

From: Michael J Gruber


On Thu, 15 May 2025 at 12:58, David Bremner <david@tethera.net> wrote:
>
> Michael J Gruber <michaeljgruber+grubix+git@gmail.com> writes:
>
> >
> > It's easy to change notmuch to index them, too, but I'm wondering what
> > the right approach is:
> >
> > - treat inline attachments as attachments (same tag, same index
> > keyword/xapian term), possibly depending on a config flag
> > `index.inline`
> > - introduce a new tag and term, say `inline`
>
> I don't have strong opinions here. I suspect treating inline the same as
> attachments probably makes sense. In some sense I'd prefer to avoid
> another configuration flag, not because of implementation difficulty,
> but just because it is a more complex UI. I guess it would be useful if
> someone(TM) could do some kind of survey of what the usage of inline
> mime attachments is.

So someone(MG) checked their mail database of 108990 real-world
e-mails. It contains 37350 content parts with "content-disposition"
set, with values:

  22069 "attachment"
  15279 "inline"
      2 "Attachmant" :-)

Among the "inline" ones, 5916 do not have a "filename" set. They boil down to:

   4413 inline,null,text/plain
    701 inline,null,message/rfc822
    473 inline,null,text/html
    175 inline,null,multipart/mixed
     72 inline,null,image/png
     28 inline,null,multipart/signed
     22 inline,null,application/pgp-signature
     12 inline,null,image/jpeg
      8 inline,null,image/gif
      6 inline,null,multipart/alternative
      2 inline,null,text/rfc822-headers
      2 inline,null,multipart/related
      2 inline,null,message/delivery-status

I guess we should not index any of them as attachments, assuming that
inline text is indexed anyway and the images are outliers. Or am I
wrong here?
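
For anyone who wants to repeat the survey on their own mail, here is a
minimal Python sketch of the tally. It is not the `jq` pipeline used for
the numbers above. It assumes the JSON printed by `notmuch show` carries
"content-type", "content-disposition" and "filename" keys on each part,
and it walks the nested structure generically rather than relying on the
exact thread layout. The function names are made up for this example.

```python
from collections import Counter


def iter_parts(node):
    """Yield every dict that looks like a MIME part (has a "content-type" key)."""
    if isinstance(node, dict):
        if "content-type" in node:
            yield node
        # Descend into nested parts (multipart, message/rfc822, ...).
        for value in node.values():
            yield from iter_parts(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_parts(item)


def tally_dispositions(threads):
    """Count (disposition, filename-or-"null", content-type) triples."""
    counts = Counter()
    for part in iter_parts(threads):
        disposition = part.get("content-disposition")
        if disposition is None:
            continue  # only parts with content-disposition set are counted
        filename = part.get("filename") or "null"
        counts[(disposition, filename, part["content-type"])] += 1
    return counts
```

To use it, parse the output of `notmuch show` in JSON format with
`json.load` and pass the result to `tally_dispositions`;
`counts.most_common()` then lists the triples by frequency.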

Among the "inline" ones with "filename" set, the top ones are:

   1387 inline,msg.asc,application/octet-stream
   1719 inline,encrypted.asc,application/octet-stream

with variations. Do we want to index them as attachments?

OTOH, the majority of "inline" with "filename" set are proper file
attachments, with a certain proportion of mere "logos".

I guess if plain inline attachments are indexed as body text then I
would skip indexing "content-disposition: inline" when "filename" is
not set. OTOH, we do index parts without a "filename" for "attachment".
The stats are:

    735 attachment,null,application/pgp-encrypted
    614 attachment,null,message/rfc822
      9 attachment,null,image/jpeg
      3 attachment,null,multipart/appledouble

Appledouble! An apple a day keeps the trouble away?
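
The rule floated above (index "inline" parts as attachments only when a
"filename" is set, keep indexing "attachment" parts either way) could be
sketched as a small predicate. This is only an illustration in Python,
not notmuch's actual code; the function name is hypothetical, and
returning False for unrecognised values such as "Attachmant" is an
assumption of this sketch, not something decided in the thread.

```python
def index_as_attachment(disposition, filename):
    """Hypothetical predicate for the rule discussed above."""
    kind = (disposition or "").strip().lower()
    if kind == "attachment":
        # Currently indexed whether or not a filename is present.
        return True
    if kind == "inline":
        # Proposed: only treat as an attachment when a filename is set.
        return bool(filename)
    # Assumption: anything else is not indexed as an attachment.
    return False
```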

Cheers
Michael

P.S.: This was a fun exercise with `notmuch show --format=json` and `jq`.