Re: [PATCH 2/4] lib: handle empty string in regexp field processors

Subject: Re: [PATCH 2/4] lib: handle empty string in regexp field processors

Date: Thu, 23 Mar 2017 23:07:13 -0300

To: notmuch@notmuchmail.org

Cc:

From: David Bremner


David Bremner <david@tethera.net> writes:

> The non-field processor behaviour is to convert the corresponding
> queries into a search for the unprefixed terms. This yields pretty
> surprising results, so I decided to match any message.
> ---
>  lib/regexp-fields.cc      | 3 +++
>  test/T650-regexp-query.sh | 2 --
>  2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/lib/regexp-fields.cc b/lib/regexp-fields.cc
> index 8e740a81..42239a66 100644
> --- a/lib/regexp-fields.cc
> +++ b/lib/regexp-fields.cc
> @@ -148,6 +148,9 @@ RegexpFieldProcessor::RegexpFieldProcessor (std::string prefix, Xapian::QueryPar
>  Xapian::Query
>  RegexpFieldProcessor::operator() (const std::string & str)
>  {
> +    if (str.size () == 0)
> +	return Xapian::Query::MatchAll;
> +

For things like file:, it actually makes more sense to return
Xapian::Query(term_prefix). I'm leaning towards something like the following:

    if (str.size () == 0) {
        if (options & NOTMUCH_FIELD_PROBABILISTIC)
            return Xapian::Query::MatchAll;
        else
            return Xapian::Query(term_prefix);
    }
 

but this patch can stay as is for now, as there are not yet any boolean
fields being matched (mid would be the first).

I thought a bit about unconditionally returning Xapian::Query
(term_prefix), but I think for the probabilistic (term based) fields,
we'll never add terms for only the prefix, i.e. an empty subject would
just not add any XSUBJECT terms. So that would effectively match no
messages.
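
To make that reasoning concrete, here is a toy model that does not use
Xapian at all. A message is just a set of term strings, and a query for
the bare prefix is an exact lookup of that term. The names FieldKind,
count_bare_prefix, match_empty and the "XBOOL" prefix used in the note
below are made up for illustration; the assumption that a boolean field
can carry a bare-prefix term is also mine, not something taken from the
notmuch schema.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for an indexed message: nothing but its set of terms.
using Terms = std::set<std::string>;

// Hypothetical flag, loosely mirroring NOTMUCH_FIELD_PROBABILISTIC.
enum FieldKind { BOOLEAN_FIELD, PROBABILISTIC_FIELD };

// Models a query for the bare prefix as an exact term lookup:
// counts the messages that contain the prefix itself as a term.
static std::size_t
count_bare_prefix (const std::vector<Terms> &index, const std::string &prefix)
{
    std::size_t n = 0;
    for (const Terms &terms : index)
        n += terms.count (prefix);
    return n;
}

// Empty-string policy sketched in the mail: match everything for
// probabilistic fields, look up the bare prefix term otherwise.
static std::size_t
match_empty (const std::vector<Terms> &index, FieldKind kind,
             const std::string &prefix)
{
    if (kind == PROBABILISTIC_FIELD)
        return index.size ();	/* stands in for MatchAll */
    return count_bare_prefix (index, prefix);
}
```

With an index where subjects only ever produce terms like
"XSUBJECThello", count_bare_prefix (index, "XSUBJECT") is 0, which is
the "match no messages" case above. A message holding the bare "XBOOL"
term is found by the boolean branch.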

d
