Re: [PATCH 3/7] lib: Make notmuch_query_search_messages set the exclude flag

Date: Tue, 31 Jan 2012 11:45:41 +0000

To: Austin Clements

Cc: notmuch@notmuchmail.org

From: Mark Walters


On Mon, 30 Jan 2012 23:43:52 -0500, Austin Clements <amdragon@MIT.EDU> wrote:
> Quoth Mark Walters on Jan 29 at  6:39 pm:
> > Add a flag NOTMUCH_MESSAGE_FLAG_EXCLUDED which is set by
> > notmuch_query_search_messages for excluded messages. Also add an
> > option omit_excluded_messages to the search for when we do not want
> > the excluded messages at all.
> > 
> > This exclude flag will be added to notmuch_query_search_threads in the
> > next patch.
> > ---
> >  lib/notmuch-private.h |    1 +
> >  lib/notmuch.h         |    8 ++++++-
> >  lib/query.cc          |   52 +++++++++++++++++++++++++++++++++++++++++++++---
> >  3 files changed, 56 insertions(+), 5 deletions(-)
> > 
> > diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h
> > index 7bf153e..e791bb0 100644
> > --- a/lib/notmuch-private.h
> > +++ b/lib/notmuch-private.h
> > @@ -401,6 +401,7 @@ typedef struct _notmuch_message_list {
> >   */
> >  struct visible _notmuch_messages {
> >      notmuch_bool_t is_of_list_type;
> > +    notmuch_doc_id_set_t *excluded_doc_ids;
> 
> I might be following the diff wrong, but shouldn't this be a field of
> notmuch_mset_messages_t?  (Then it also doesn't have to be a pointer,
> which is really how notmuch_doc_id_set_t was designed to be used.)

I will need to think about that.

> >      notmuch_message_node_t *iterator;
> >  };
> >  
> > diff --git a/lib/notmuch.h b/lib/notmuch.h
> > index 7929fe7..740d005 100644
> > --- a/lib/notmuch.h
> > +++ b/lib/notmuch.h
> > @@ -449,6 +449,11 @@ typedef enum {
> >  const char *
> >  notmuch_query_get_query_string (notmuch_query_t *query);
> >  
> > +/* specify whether to results should omit the excluded results rather
> > + * than just marking them excluded */
> > +void
> > +notmuch_query_set_omit_excluded_messages (notmuch_query_t *query, notmuch_bool_t omit);
> > +
> 
> I don't think we should add this API.  The library behavior will not
> change for library users that don't use excludes and library users
> that do use excludes should be aware of the excluded flag and do the
> appropriate thing.
> 
> I can see why this is handy in some cases, but I don't think it
> provides enough utility to warrant becoming part of the permanent and
> minimal library interface.

This is really a performance improvement: suppose that there are lots of
threads whose only matches are in excluded messages. Without this flag we
would spend a lot of time constructing each such thread only for it to be
ignored. (In contrived situations this could be arbitrarily slower.)

Note the benchmarks were against master with the exclude code switched
off so that I was comparing the creation of the same threads. Sorry if I
didn't make that clear.
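
To make the cost argument concrete, here is a minimal toy model in C. It is only a sketch: toy_match_t, toy_threads_constructed and toy_sample are hypothetical names invented for illustration and are not part of notmuch. It assumes thread construction is the expensive step and simply counts how many threads would have to be built when excluded matches are dropped up front (the OP_AND_NOT case) versus when they are kept and merely flagged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_THREADS 16

/* Toy stand-in for a matching message; not a notmuch type. */
typedef struct {
    unsigned int thread_id; /* Thread the message belongs to. */
    bool excluded;          /* Message also matches the exclude query. */
} toy_match_t;

/* Count the threads that would be constructed for a set of matches.
 * If omit_excluded is true, excluded matches are dropped before any
 * thread is built; otherwise every thread containing a match is
 * built, even if all of its matches are excluded. */
static size_t
toy_threads_constructed (const toy_match_t *matches, size_t n,
			 bool omit_excluded)
{
    bool seen[TOY_MAX_THREADS] = { false };
    size_t built = 0;

    for (size_t i = 0; i < n; i++) {
	assert (matches[i].thread_id < TOY_MAX_THREADS);
	if (omit_excluded && matches[i].excluded)
	    continue;
	if (! seen[matches[i].thread_id]) {
	    seen[matches[i].thread_id] = true;
	    built++;
	}
    }
    return built;
}

/* Threads 1 and 2 match only in excluded messages. */
static const toy_match_t toy_sample[] = {
    { 0, false },
    { 1, true }, { 1, true },
    { 2, true },
    { 3, false }, { 3, true },
};
```

In this sample, omitting builds two threads (0 and 3) while flagging builds all four, and the gap grows with the number of threads that match only in excluded messages.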

> >  /* Specify the sorting desired for this query. */
> >  void
> >  notmuch_query_set_sort (notmuch_query_t *query, notmuch_sort_t sort);
> > @@ -895,7 +900,8 @@ notmuch_message_get_filenames (notmuch_message_t *message);
> >  
> >  /* Message flags */
> >  typedef enum _notmuch_message_flag {
> > -    NOTMUCH_MESSAGE_FLAG_MATCH
> > +    NOTMUCH_MESSAGE_FLAG_MATCH,
> > +    NOTMUCH_MESSAGE_FLAG_EXCLUDED
> >  } notmuch_message_flag_t;
> >  
> >  /* Get a value of a flag for the email corresponding to 'message'. */
> > diff --git a/lib/query.cc b/lib/query.cc
> > index c25b301..7d165d2 100644
> > --- a/lib/query.cc
> > +++ b/lib/query.cc
> > @@ -28,6 +28,7 @@ struct _notmuch_query {
> >      const char *query_string;
> >      notmuch_sort_t sort;
> >      notmuch_string_list_t *exclude_terms;
> > +    notmuch_bool_t omit_excluded_messages;
> >  };
> >  
> >  typedef struct _notmuch_mset_messages {
> > @@ -57,6 +58,12 @@ struct visible _notmuch_threads {
> >      notmuch_doc_id_set_t match_set;
> >  };
> >  
> > +/* we need this in the message functions so forward declare */
> 
> Comments should start with a capital letter and end with a period.
> (The code isn't completely consistent about this, but it is something
> we're codifying in the upcoming style guide.)

Will fix.

> > +static notmuch_bool_t
> > +_notmuch_doc_id_set_init (void *ctx,
> > +			  notmuch_doc_id_set_t *doc_ids,
> > +			  GArray *arr);
> > +
> >  notmuch_query_t *
> >  notmuch_query_create (notmuch_database_t *notmuch,
> >  		      const char *query_string)
> > @@ -79,6 +86,8 @@ notmuch_query_create (notmuch_database_t *notmuch,
> >  
> >      query->exclude_terms = _notmuch_string_list_create (query);
> >  
> > +    query->omit_excluded_messages = FALSE;
> > +
> >      return query;
> >  }
> >  
> > @@ -89,6 +98,12 @@ notmuch_query_get_query_string (notmuch_query_t *query)
> >  }
> >  
> >  void
> > +notmuch_query_set_omit_excluded_messages (notmuch_query_t *query, notmuch_bool_t omit)
> > +{
> > +    query->omit_excluded_messages = omit;
> > +}
> > +
> > +void
> >  notmuch_query_set_sort (notmuch_query_t *query, notmuch_sort_t sort)
> >  {
> >      query->sort = sort;
> > @@ -173,6 +188,7 @@ notmuch_query_search_messages (notmuch_query_t *query)
> >  						   "mail"));
> >  	Xapian::Query string_query, final_query, exclude_query;
> >  	Xapian::MSet mset;
> > +	Xapian::MSetIterator iterator;
> >  	unsigned int flags = (Xapian::QueryParser::FLAG_BOOLEAN |
> >  			      Xapian::QueryParser::FLAG_PHRASE |
> >  			      Xapian::QueryParser::FLAG_LOVEHATE |
> > @@ -190,11 +206,35 @@ notmuch_query_search_messages (notmuch_query_t *query)
> >  	    final_query = Xapian::Query (Xapian::Query::OP_AND,
> >  					 mail_query, string_query);
> >  	}
> > +	messages->base.excluded_doc_ids = NULL;
> > +
> > +	if (query->exclude_terms) {
> > +	    exclude_query = _notmuch_exclude_tags (query, final_query);
> > +	    exclude_query = Xapian::Query (Xapian::Query::OP_AND,
> > +					   exclude_query, final_query);
> > +
> > +	    if (query->omit_excluded_messages)
> > +		final_query = Xapian::Query (Xapian::Query::OP_AND_NOT,
> > +					     final_query, exclude_query);
> > +	    else {
> > +		enquire.set_weighting_scheme (Xapian::BoolWeight());
> > +		enquire.set_query (exclude_query);
> > +
> > +		mset = enquire.get_mset (0, notmuch->xapian_db->get_doccount ());
> > +
> > +		GArray *excluded_doc_ids = g_array_new (FALSE, FALSE, sizeof (unsigned int));
> > +
> > +		for (iterator = mset.begin (); iterator != mset.end (); iterator++)
> > +		{
> 
> No newline before the brace.

Will fix.

> 
> > +		    unsigned int doc_id = *iterator;
> > +		    g_array_append_val (excluded_doc_ids, doc_id);
> > +		}
> > +		messages->base.excluded_doc_ids = talloc (query, _notmuch_doc_id_set);
> > +		_notmuch_doc_id_set_init (query, messages->base.excluded_doc_ids,
> > +					  excluded_doc_ids);
> 
> Don't forget to g_array_unref excluded_doc_ids.

Yes, I will add that.

> > +	    }
> > +	}
> >  
> > -	exclude_query = _notmuch_exclude_tags (query, final_query);
> > -
> > -	final_query = Xapian::Query (Xapian::Query::OP_AND_NOT,
> > -					 final_query, exclude_query);
> >  
> >  	enquire.set_weighting_scheme (Xapian::BoolWeight());
> >  
> > @@ -283,6 +323,10 @@ _notmuch_mset_messages_get (notmuch_messages_t *messages)
> >  	INTERNAL_ERROR ("a messages iterator contains a non-existent document ID.\n");
> >      }
> >  
> > +    if ((messages->excluded_doc_ids) &&
> > +	(_notmuch_doc_id_set_contains (messages->excluded_doc_ids, doc_id)))
> 
> No need for so many parens (just a nit).

Will fix.

> > +	notmuch_message_set_flag (message, NOTMUCH_MESSAGE_FLAG_EXCLUDED, TRUE);
> > +
> >      return message;
> >  }
> >  
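
For readers following the marking path, the idea of collecting the excluded document IDs into a set and testing each result against it can be sketched as below. This is a toy bitmap set written for illustration, under the assumption that IDs are small unsigned integers; the toy_* names are hypothetical and this is not the notmuch_doc_id_set_t implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy bitmap set of document IDs; a stand-in, not a notmuch type. */
typedef struct {
    unsigned char *bitmap;
    unsigned int bound; /* One past the largest ID stored. */
} toy_doc_id_set_t;

/* Build the set from an array of IDs. Returns false on allocation
 * failure. Assumes no ID equals UINT_MAX. */
static bool
toy_doc_id_set_init (toy_doc_id_set_t *set, const unsigned int *ids, size_t n)
{
    unsigned int max = 0;

    for (size_t i = 0; i < n; i++)
	if (ids[i] > max)
	    max = ids[i];

    set->bound = max + 1;
    set->bitmap = calloc ((set->bound + 7) / 8, 1);
    if (set->bitmap == NULL)
	return false;

    for (size_t i = 0; i < n; i++)
	set->bitmap[ids[i] / 8] |= (unsigned char) (1u << (ids[i] % 8));
    return true;
}

/* Test membership; IDs at or beyond the bound are never members. */
static bool
toy_doc_id_set_contains (const toy_doc_id_set_t *set, unsigned int id)
{
    assert (set->bitmap != NULL);
    if (id >= set->bound)
	return false;
    return (set->bitmap[id / 8] >> (id % 8)) & 1u;
}

/* Convenience wrapper: would doc_id be flagged as excluded? */
static bool
toy_is_excluded (const unsigned int *excluded, size_t n, unsigned int doc_id)
{
    toy_doc_id_set_t set;
    bool result;

    if (! toy_doc_id_set_init (&set, excluded, n))
	return false;
    result = toy_doc_id_set_contains (&set, doc_id);
    free (set.bitmap); /* Release the temporary set, as with the GArray. */
    return result;
}

static const unsigned int toy_excluded_ids[] = { 3, 8, 21 };
```

A real caller would build the set once per query and reuse it for every result rather than rebuilding it per lookup as the wrapper does here.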

Thanks

Mark

