David Bremner <david@tethera.net> writes:

> From: David Bremner <david@tethera.net>
> Subject: Re: RFC: drop html tags
> To: Steven Allen <steven@stebalien.com>
> Date: Tue, 21 Mar 2017 14:03:10 -0300
>
> Steven Allen <steven@stebalien.com> writes:
>
>> In the JavaScript regex format, I believe the correct way to parse this is:
>>
>> /<("[^"]*"|'[^']*'|[^"'>]*)*>/g
>>
>> Basically, while inside a tag, ignore everything between double and single quotes.
>
> Thanks for the reality check. It should be possible to handle quotes. In
> my limited understanding of that regex, we can do a bit better by
> forcing pairs of quotes to match, since I <chaos attribute="'"> is
> probably legal.

Actually, I'm wrong. My eyes just glaze over when faced with any
non-trivial regex, I guess.

d
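
For anyone else whose eyes glaze over, here is a small sketch of how the quoted regex behaves. TAG_RE is copied verbatim from the message above. The names dropTags, dropTagsNaive and NAIVE_RE are made up for illustration and do not come from the thread or from any existing code. The regex seems to already pair each quote with a closing quote of the same kind, which would explain the retraction.

```javascript
// Regex copied from the thread: "<", then any mix of double-quoted runs,
// single-quoted runs, or characters that are neither a quote nor ">",
// then the closing ">".
const TAG_RE = /<("[^"]*"|'[^']*'|[^"'>]*)*>/g;

// Assumed baseline for comparison only (not from the thread): stop at the
// first ">" regardless of quoting.
const NAIVE_RE = /<[^>]*>/g;

// Hypothetical helper: delete everything the quote-aware regex matches.
function dropTags(text) {
  return text.replace(TAG_RE, "");
}

// Hypothetical helper: delete everything the naive regex matches.
function dropTagsNaive(text) {
  return text.replace(NAIVE_RE, "");
}

// The example from the message: a single quote inside a double-quoted value.
console.log(JSON.stringify(dropTags('I <chaos attribute="\'"> agree')));
// → "I  agree"

// A ">" inside a quoted attribute value is where the two regexes differ.
console.log(JSON.stringify(dropTags('<a title="1 > 0">link</a>')));
// → "link"
console.log(JSON.stringify(dropTagsNaive('<a title="1 > 0">link</a>')));
// → " 0\">link"
```

In the first call, the double-quoted alternative consumes "'" as a unit, so the lone single quote never gets a chance to open a single-quoted run. This is only a demonstration on short, well-formed input, not a claim that a regex is a complete HTML parser.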