Re: [PATCH 1/4] emacs: new customization variable to exclude "deleted" messages from search

Date: Sun, 8 Jan 2012 21:46:10 -0500

To: Jameson Graef Rollins

Cc: Notmuch Mail

From: Austin Clements


Quoth Jameson Graef Rollins on Jan 08 at  6:34 pm:
> On Sun, 8 Jan 2012 20:49:38 -0500, Austin Clements <amdragon@MIT.EDU> wrote:
> > > > @@ -927,6 +932,9 @@ The optional parameters are used as follows:
> > > >      (set 'notmuch-search-target-thread target-thread)
> > > >      (set 'notmuch-search-target-line target-line)
> > > >      (set 'notmuch-search-continuation continuation)
> > > > +    (when (and notmuch-search-exclude-deleted
> > > > +	       (not (string-match "tag:deleted[ )]*" query)))
> > > 
> > > “is:” is a synonym for “tag:” in searches – so this section of the code
> > > should look for it too.
> > 
> > There are several other things that could also trip up this regexp.
> > xtag:deletedx would be falsely matched, as would a quoted phrase
> > containing "tag:deleted", while tag:"deleted" and tag:(deleted) would
> > incorrectly not be matched.
> 
> Thanks so much for the review, guys.  I should have mentioned in this
> patch that my regex skills are very weak, and that it was surely
> incomplete.  I always forget about the is: prefix as well.
> 
> > Getting this right is hard, though I'd be happy with
> > 
> >   "\\<\\(tag\\|is\\):deleted\\>"
> 
> Every time I think I start to understand regex I am reminded that it's
> black magic and I really know nothing.  For instance, I am not familiar
> with "<" or ">", although they appear to be "word" boundaries
> (although I'm not sure how "word" is defined).  Also, why is all the \\
> (double?)  escaping needed?  I'll certainly take your word for it,
> though.

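I'm not positive, but I think \< and \> match the empty string at the
start and end of a word.  That is, \> matches on the transition from a
"word-constituent" character to a non-word-constituent character (or
the end of the text), and \< matches the reverse transition, with
"word-constituent" defined by Emacs' active syntax table.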
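
To make the syntax-table dependence concrete, here is a minimal sketch.
The helper my/deleted-ends-word-p and the table my/dash-is-word-table
are names made up for this illustration; only string-match,
with-syntax-table, make-syntax-table and modify-syntax-entry are
actual Emacs functions.

```elisp
;; Hypothetical helper (not part of notmuch): return t if "deleted"
;; ends at a word boundary somewhere in STR, judged by syntax TABLE.
(defun my/deleted-ends-word-p (str table)
  (with-syntax-table table
    (and (string-match "deleted\\>" str) t)))

;; Assumption for the demo: a syntax table in which "-" is treated
;; as a word constituent.  make-syntax-table inherits everything
;; else from the standard syntax table.
(defvar my/dash-is-word-table
  (let ((table (make-syntax-table)))
    (modify-syntax-entry ?- "w" table)
    table))

;; With the standard table, "-" is not a word constituent, so the
;; word ends right after "deleted" and \> can match there.
(my/deleted-ends-word-p "tag:deleted-ish" (standard-syntax-table))

;; With the modified table, "deleted-ish" is one word, so \> has
;; nowhere to match after "deleted".
(my/deleted-ends-word-p "tag:deleted-ish" my/dash-is-word-table)
```

The same regexp can therefore give different answers depending on
which buffer (and hence which syntax table) is current when it runs.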

The slashes are all doubled because I was writing it as an Emacs
string for easy pasting (sorry, I should have been explicit about
that).  The regexp itself is

  \<\(tag\|is\):deleted\>
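
As a rough sketch of how the doubled form behaves once it is read as an
Emacs string, the snippet below tries it against a few of the queries
discussed in this thread.  The names my/deleted-tag-regexp and
my/query-mentions-deleted-p are invented for this example and are not
part of the patch; pinning the standard syntax table is also my own
assumption, made so the result does not depend on the current buffer.

```elisp
;; The regexp written as an Emacs string: every backslash of the
;; regexp itself has to be doubled inside the string literal.
(defconst my/deleted-tag-regexp "\\<\\(tag\\|is\\):deleted\\>")

;; Hypothetical predicate: return t if QUERY already mentions
;; tag:deleted or is:deleted as a separate term.
(defun my/query-mentions-deleted-p (query)
  (with-syntax-table (standard-syntax-table)
    (and (string-match my/deleted-tag-regexp query) t)))

;; Printing the string shows the regexp with single backslashes.
(princ my/deleted-tag-regexp)

(my/query-mentions-deleted-p "tag:deleted")          ; t
(my/query-mentions-deleted-p "from:foo is:deleted")  ; t
(my/query-mentions-deleted-p "xtag:deletedx")        ; nil
(my/query-mentions-deleted-p "tag:\"deleted\"")      ; nil, still missed
```

A quoted phrase that happens to contain tag:deleted would still match,
so this only narrows the false positives rather than eliminating them.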

> > or maybe
> > 
> >   "\\<\\(tag\\|is\\):\\(\"?\\)deleted\\>\\2"
> 
> After staring at this for 10 minutes I think I'm getting the extra bits
> here.  It matches an initial \", and then a second at the end if the
> first matched.  That's clever.  Why 

Exactly.

>   \\>\\2
> 
> instead of
> 
>  \\2\\>
> 
> ?

Okay, that can qualify as black magic.  The problem is that a " will
mess up the word-boundary matching because " isn't a word-constituent
character.  So, if it is looking for a quote at the end, the \2 in
\2\> would match and consume the ", but then the \> wouldn't match,
because the character just before it would be the " rather than the
final "d" of "deleted".
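
The difference between the two orderings can be checked directly.  This
is only a sketch: the constant and function names are made up for the
comparison, and the standard syntax table is pinned as an assumption so
the result does not depend on the current buffer.

```elisp
;; Word boundary first, then the optional closing quote (as proposed).
(defconst my/boundary-then-quote
  "\\<\\(tag\\|is\\):\\(\"?\\)deleted\\>\\2")

;; Closing quote first, then the word boundary (the ordering asked about).
(defconst my/quote-then-boundary
  "\\<\\(tag\\|is\\):\\(\"?\\)deleted\\2\\>")

;; Hypothetical helper: return t if REGEXP matches anywhere in QUERY.
(defun my/regexp-matches-p (regexp query)
  (with-syntax-table (standard-syntax-table)
    (and (string-match regexp query) t)))

;; Unquoted form: group 2 is empty, so both orderings match.
(my/regexp-matches-p my/boundary-then-quote "tag:deleted")      ; t
(my/regexp-matches-p my/quote-then-boundary "tag:deleted")      ; t

;; Quoted form: only the boundary-first ordering matches, because
;; \> cannot match directly after the consumed closing quote.
(my/regexp-matches-p my/boundary-then-quote "tag:\"deleted\"")  ; t
(my/regexp-matches-p my/quote-then-boundary "tag:\"deleted\"")  ; nil

;; An unbalanced quote is rejected by the backreference.
(my/regexp-matches-p my/boundary-then-quote "tag:\"deleted")    ; nil
```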
