Thanks for the thorough review. My updated patches are on the eager-metadata-v4 branch (also, for-review/eager-metadata-v4) at http://awakening.csail.mit.edu/git/notmuch.git Responses inline. On Thu, Mar 10, 2011 at 10:48 PM, Carl Worth <cworth@cworth.org> wrote: > On Sun, 13 Feb 2011 15:25:48 -0500, Austin Clements <amdragon@mit.edu> wrote: >> I've rebased this against current master and fixed a few merge >> conflicts. The rebased patches are on the eager-metadata-v3 branch of >> http://awakening.csail.mit.edu/git/notmuch.git > > Hi Austin, > > This looks like a great set of optimizations and cleanup. Here is some > (long-overdue) review. > > First, the patch series looks excellent and my review here is quite > nit-picky. I feel bad reviewing this so late and not just immediately > merging it, but I will commit to picking up a refreshed series very > quickly. > > One very minor thing is that the word "metadata" might be confused with > Xapian metadata which we are using already such as: > > version_string = notmuch->xapian_db->get_metadata ("version"); > > So we might want to come up with a better name here. I don't have any > concrete suggestion yet, so if you don't think of anything obvious, then > don't worry about it. Hmm. I think this is okay because I always refer to "message metadata" (and I couldn't think of a better term). >> + /* Get tags */ >> + if (!message->tag_list) { >> + message->tag_list = >> + _notmuch_get_terms_with_prefix (message, i, end, tag_prefix); >> + _notmuch_string_list_sort (message->tag_list); >> + } > > Looks like the above case is missing the assert to ensure proper prefix > ordering. Fixed. >> + if (!message->tag_list) >> + _notmuch_message_ensure_metadata (message); >> + >> + /* We tell the iterator to add a talloc reference to the >> + * underlying list, rather than stealing it. As a result, it's >> + * possible to modify the message tags while iterating because >> + * the iterator will keep the current list alive. */ >> + return _notmuch_tags_create (message, message->tag_list, FALSE); > > The behavior here is great, but don't like Boolean function parameters > being used to change the semantics. The problem with a Boolean argument > is that it's impossible to know the semantic by just reading the calling > code---you must also consult the implementation as well. > > For any case where it seems natural to implement something with a > Boolean argument, I sometimes use an approach something like this: > > static void > _foo_function_internal (foo_t *, ..., bool_t be_different) > { > ...; > } > > void > foo_function (foo_t *, ...) > { > return _foo_function_internal (foo, ..., FALSE); > } > > void > foo_function_different (foo_t *, ...) > { > return _foo_function_internal (foo, ..., TRUE); > } > > That is, I'm willing to accept the Boolean parameter in the case of a > static function defined immediately next to all callers. Here, for any > non-static callers the semantics should be quite clear. (And perhaps the > function calling with the FALSE case will need a clarifying suffix as > well---one might have some_function_foo and some_function_bar for the > two cases). > > Of course, my proscription against Boolean parameter doesn't apply to > something like a function that is setting some Boolean feature to TRUE > or FALSE. The important criterion is that the function semantics should > be evident to someone reading code calling the function. So if the > Boolean argument is obviously tied to some portion of the function name, > then that can be adequate. > > Enough with the generalities, back to _notmuch_tags_create... > > The caller above is the exceptional caller. It's the only one using > passing FALSE, and it also already has a large comment. So it seems that > the right fix here is to have the extra talloc_reference happen here > where there's a comment talking about adding a talloc reference. > > So it would be great to see something here like: > > tags = _notmuch_tags_create (message, message->tag_list); > talloc_reference (message, message->tag_list); > return tags; > > Of course, that would mean that _notmuch_tags_create can't have done the > talloc_steal. So perhaps all the other callers should be calling: > > _notmuch_tags_create_with_steal (ctx, tag_list); > > What do you think? Your point about the boolean argument is well taken. In this case there's really no need for two difference versions, so I wound up making _notmuch_tags_create always steal the reference. I debated for a while whether it should add a reference, thus forcing most callers to talloc_unlink, or steal the reference, thus forcing one caller to talloc_reference. I ultimately decided it was much more likely that the caller would forget the talloc_unlink, resulting in a difficult-to-track memory leak (since the list would *eventually* be cleaned up), than that the steal would cause confusion. Also, there is some precedence for internal functions that steal an argument. So I made it steal the reference. This doesn't cause any problems in the one weird case that still needs the reference to the list. After the _notmuch_tags_create, the caller simply adds that reference. >> -notmuch_tags_t * >> -_notmuch_convert_tags (void *ctx, Xapian::TermIterator &i, >> - Xapian::TermIterator &end); >> +notmuch_string_list_t * >> +_notmuch_get_terms_with_prefix (void *ctx, Xapian::TermIterator &i, >> + Xapian::TermIterator &end, >> + const char *prefix); > > I don't really like the fact that _notmuch_get_terms_with_prefix doesn't > make a clear indication of what file it's defined in, (the old function > _notmuch_convert_tags had the same problem). It could be named > _notmuch_database_convert_tags since it's in database.cc, but that would > be odd in not actually accepting a notmuch_database_t * as the first > parameter. Any suggestion here? _notmuch_get_terms_with_prefix is weird because it captures a pattern that lies squarely in the Xapian world, being all about term iterators. Hence, I think the "right" solution (note, not the best solution), would be to hide the term iterators and make two copies of the function: one that takes a notmuch_database_t and always considers all database terms, and one private to message.cc that acts as a helper to what's now all bundled in _notmuch_message_ensure_metadata. But that bothers me more than a function that doesn't take a notmuch_database_t *. So I just added "database" to the name. >> index 0000000..d92a0ba >> --- /dev/null >> +++ b/lib/strings.c >> @@ -0,0 +1,94 @@ >> +/* strings.c - Iterator for a list of strings > > Similarly, this file might be better as string-list.c. Done. >> + * You should have received a copy of the GNU General Public License >> + * along with this program. If not, see http://www.gnu.org/licenses/ . >> + * >> + * Author: Carl Worth <cworth@cworth.org> > > Hey, wait a second, some of this code is mine, but I think the sort > function is yours. Please do start annotating the Author tags on all > files as appropriate. (There are probably lots of files missing your > Author attribution---I just happened to notice this one here since you > happened to have an Author line in the patch.) Heh, fixed. I suppose I hadn't been thinking about it, since none of the source in lib/ has other authors listed. But it raises an interesting question. When is it kosher to add oneself to a file's author list? In this case I wrote about half of the code in that admittedly short file, so that makes sense Changing a few lines presumably isn't enough. Adding a few functions? >> -/* XXX: Should write some talloc-friendly list to avoid the need for >> - * this. */ > > Hurrah! I love patches that get to remove XXX comments. ]:--8) >> + /* Get thread */ >> + if (!message->thread_id) >> + message->thread_id = >> + _notmuch_message_get_term (message, i, end, thread_prefix); >> + >> + /* Get id */ >> + assert (strcmp (thread_prefix, id_prefix) < 0); >> + if (!message->message_id) >> + message->message_id = >> + _notmuch_message_get_term (message, i, end, id_prefix); > > Missing asserts on the above two as well? Fixed. (Actually, I think there was only one assert missing in total, but it had propagated through the history.) > -Carl