On Wed, 11 Sep 2013, Daniel Kahn Gillmor <dkg@fifthhorseman.net> wrote:
> On 09/10/2013 06:35 PM, Austin Clements wrote:
>
>> I haven't looked at exactly what workarounds this enables, but if it's
>> what I'm guessing (RFC 2047 escapes in the middle of RFC 2822 text
>> tokens), are there really subject lines that this will misinterpret
>> that weren't obviously crafted to break the workaround?
>
> not to get all meta, but i imagine subject lines that refer an example
> of this particular issue (e.g. when talking about RFC 2047) will break
> ;)

I'm trying one variant here. The meta reply here, running the patch. The
broken RFC 2047 got liberally accepted. :)

>> The RFC 2047
>> escape sequence was deliberately designed to be obscure, since RFC
>> 2047 itself caused previously "standards-compliant" subject lines to
>> potentially be interpreted differently.
>
> right, and it was designed explicitly to put the boundary markers at word
> boundaries, and not in the middle of a word (i think that's what this is
> all about, right?). so implementations which put the boundary markers
> in the middle of a word, or which include whitespace within the encoded
> text, aren't speaking RFC 2047.
>
> anyway, if there's a rough consensus to go forward with this, i'm not
> about to block it. I understand that a large part of the business of
> being an MUA is working around other people's bugs instead of expecting
> them to fix them :/ I just don't like mis-rendering other text.

I share your concern. Yet the amount of email with unintentionally
broken encoding is much greater than the amount of email that has
intentional character sequences that resemble broken encodings. Which is
why I'm willing to sacrifice the latter to improve the user experience
for the majority of users. YMMV.

BR,
Jani.
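
To make the trade-off above concrete, here is a minimal Python sketch of strict versus liberal handling of RFC 2047 encoded-words. It is not the patch under discussion; the names `decode_subject`, `STRICT`, and `LIBERAL` are hypothetical, and the sketch deliberately ignores details such as dropping the whitespace between adjacent encoded-words.

```python
import base64
import binascii
import re

# An RFC 2047 encoded-word: =?charset?encoding?encoded-text?=
# The encoded-text must not contain whitespace or '?'.
ENCODED_WORD = r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?="

# Strict: the encoded-word must be a whole whitespace-delimited token.
STRICT = re.compile(r"(?<!\S)" + ENCODED_WORD + r"(?!\S)")
# Liberal (assumed workaround): accept an encoded-word anywhere, even mid-word.
LIBERAL = re.compile(ENCODED_WORD)


def _decode_match(m):
    """Decode one encoded-word; return it unchanged if decoding fails."""
    charset, enc, text = m.group(1), m.group(2).upper(), m.group(3)
    try:
        if enc == "B":
            raw = base64.b64decode(text, validate=True)
        else:
            # Q encoding: '_' stands for a space, '=XX' is a hex-escaped byte.
            raw = re.sub(
                rb"=([0-9A-Fa-f]{2})",
                lambda h: bytes([int(h.group(1), 16)]),
                text.replace("_", " ").encode("ascii"),
            )
        return raw.decode(charset)
    except (binascii.Error, LookupError, ValueError):
        return m.group(0)


def decode_subject(subject, liberal=False):
    """Decode encoded-words in a subject line, strictly or liberally."""
    pattern = LIBERAL if liberal else STRICT
    return pattern.sub(_decode_match, subject)
```

With this sketch, a subject such as `pre=?utf-8?q?caf=C3=A9?=post` is left untouched in strict mode and decoded in liberal mode. That helps with mail from broken senders, but it also rewrites a subject that merely quotes such a sequence mid-word, which is the mis-rendering concern raised above.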