Quoth Mark Walters on Jul 27 at 10:35 am:
> 
> Hi
> 
> On Sun, 27 Jul 2014, Austin Clements <amdragon@MIT.EDU> wrote:
> > Previously, the upgrade was organized as two passes -- an upgrade
> > pass, and a separate cleanup pass -- so the database was always in a
> > valid state.  This change substantially simplifies this code by
> > performing the upgrade in a transaction and combining both passes in
> > to one.  This 1) eliminates a lot of duplicate code between the
> > passes, 2) speeds up the upgrade process, 3) makes progress reporting
> > more accurate, 4) eliminates the potential for stale data if the
> > upgrade is interrupted during the cleanup pass, and 5) makes it easier
> > to reason about the safety of the upgrade code.
> 
> I like this but I wonder if it has a side effect: I think with the
> current code the user can interrupt the upgrade (ctrl-c) and continue
> roughly where it left off. This looks like it means the whole upgrade
> needs to be done in one go. Will this be a problem on large mail stores
> (eg rlb with over 1M messages)?
> 
> I am not sure what could be done during the interrupted upgrade before
> so maybe this is not a problem.

I haven't tested this hypothesis, but I don't think a partially
completed upgrade would actually help upon restarting the upgrade.
Since the old upgrade process couldn't safely remove terms/data until
the end of the upgrade, if it were interrupted, the next upgrade would
start right back at the beginning and do everything over again.  Also,
since the old upgrade code had to update the version number before
removing old terms/data, if it was interrupted during the cleanup
process the database would be left with cruft that would *never* be
removed.

With features we actually have a better chance of making partially
completed upgrades useful: we could commit after each individual
feature gets upgraded.
Of course, that only helps when upgrade has multiple new features to
upgrade to, so it may or may not be useful in practice depending on
how quickly we add new features.

> Best wishes
> 
> Mark
> 
> 
> > ---
> >  lib/database.cc | 67 ++++++---------------------------------------------------
> >  1 file changed, 7 insertions(+), 60 deletions(-)
> > 
> > diff --git a/lib/database.cc b/lib/database.cc
> > index 03eef3e..0be7180 100644
> > --- a/lib/database.cc
> > +++ b/lib/database.cc
> > @@ -1238,6 +1238,9 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
> >          timer_is_active = TRUE;
> >      }
> >  
> > +    /* Perform the upgrade in a transaction. */
> > +    db->begin_transaction (true);
> > +
> >      /* Before version 1, each message document had its filename in the
> >       * data field. Copy that into the new format by calling
> >       * notmuch_message_add_filename.
> > @@ -1265,6 +1268,7 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
> >              filename = _notmuch_message_talloc_copy_data (message);
> >              if (filename && *filename != '\0') {
> >                  _notmuch_message_add_filename (message, filename);
> > +                _notmuch_message_clear_data (message);
> >                  _notmuch_message_sync (message);
> >              }
> >              talloc_free (filename);
> > @@ -1312,6 +1316,8 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
> >                                                        NOTMUCH_FIND_CREATE, &status);
> >                  notmuch_directory_set_mtime (directory, mtime);
> >                  notmuch_directory_destroy (directory);
> > +
> > +                db->delete_document (*p);
> >              }
> >          }
> >      }
> > @@ -1353,67 +1359,8 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
> >      notmuch->features |= NOTMUCH_FEATURES_CURRENT;
> >      db->set_metadata ("features", _print_features (local, notmuch->features));
> >      db->set_metadata ("version", STRINGIFY (NOTMUCH_DATABASE_VERSION));
> > -    db->flush ();
> > -
> > -    /* Now that the upgrade is complete we can remove the old data
> > -     * and documents that are no longer needed. */
> > -    if (version < 1) {
> > -        notmuch_query_t *query = notmuch_query_create (notmuch, "");
> > -        notmuch_messages_t *messages;
> > -        notmuch_message_t *message;
> > -        char *filename;
> > -
> > -        for (messages = notmuch_query_search_messages (query);
> > -             notmuch_messages_valid (messages);
> > -             notmuch_messages_move_to_next (messages))
> > -        {
> > -            if (do_progress_notify) {
> > -                progress_notify (closure, (double) count / total);
> > -                do_progress_notify = 0;
> > -            }
> > -
> > -            message = notmuch_messages_get (messages);
> > -
> > -            filename = _notmuch_message_talloc_copy_data (message);
> > -            if (filename && *filename != '\0') {
> > -                _notmuch_message_clear_data (message);
> > -                _notmuch_message_sync (message);
> > -            }
> > -            talloc_free (filename);
> > -
> > -            notmuch_message_destroy (message);
> > -        }
> >  
> > -        notmuch_query_destroy (query);
> > -    }
> > -
> > -    if (version < 1) {
> > -        Xapian::TermIterator t, t_end;
> > -
> > -        t_end = notmuch->xapian_db->allterms_end ("XTIMESTAMP");
> > -
> > -        for (t = notmuch->xapian_db->allterms_begin ("XTIMESTAMP");
> > -             t != t_end;
> > -             t++)
> > -        {
> > -            Xapian::PostingIterator p, p_end;
> > -            std::string term = *t;
> > -
> > -            p_end = notmuch->xapian_db->postlist_end (term);
> > -
> > -            for (p = notmuch->xapian_db->postlist_begin (term);
> > -                 p != p_end;
> > -                 p++)
> > -            {
> > -                if (do_progress_notify) {
> > -                    progress_notify (closure, (double) count / total);
> > -                    do_progress_notify = 0;
> > -                }
> > -
> > -                db->delete_document (*p);
> > -            }
> > -        }
> > -    }
> > +    db->commit_transaction ();
> >  
> >      if (timer_is_active) {
> >          /* Now stop the timer. */
> > 
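
To make the "commit after each individual feature" idea from the reply
concrete, here is a rough, untested-against-notmuch sketch.  It does
not use Xapian or any notmuch internals: ToyDatabase, FeatureUpgrade,
upgrade() and the feature names in the usage note are all hypothetical
stand-ins, and the only point is the control flow of one transaction
per feature, so that an interrupted upgrade keeps the features it
already finished.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a transactional database.  It only tracks
// which features have been upgraded; this is NOT Xapian's or notmuch's API.
class ToyDatabase {
public:
    void begin_transaction () {
        assert (! in_txn_);
        pending_ = committed_;   // work on a private copy
        in_txn_ = true;
    }
    void commit_transaction () {
        assert (in_txn_);
        committed_ = pending_;   // publish the changes atomically
        in_txn_ = false;
    }
    void cancel_transaction () {
        assert (in_txn_);
        in_txn_ = false;         // pending_ is simply abandoned
    }
    void add_feature (const std::string &name) {
        assert (in_txn_);
        pending_.insert (name);
    }
    bool has_feature (const std::string &name) const {
        return committed_.count (name) != 0;
    }

private:
    std::set<std::string> committed_, pending_;
    bool in_txn_ = false;
};

// One upgrade step per feature.  apply() may throw to model an
// interruption (e.g. the user pressing ctrl-c).
struct FeatureUpgrade {
    std::string name;
    std::function<void ()> apply;
};

// Upgrade every feature the database lacks, committing after each one.
// Returns the number of features upgraded by this call.
int upgrade (ToyDatabase &db, const std::vector<FeatureUpgrade> &steps)
{
    int done = 0;
    for (const auto &step : steps) {
        if (db.has_feature (step.name))
            continue;            // already committed by an earlier run
        db.begin_transaction ();
        try {
            step.apply ();
            db.add_feature (step.name);
        } catch (...) {
            db.cancel_transaction ();   // drop the half-done feature
            throw;
        }
        db.commit_transaction ();
        ++done;
    }
    return done;
}
```

With two made-up steps, say "file_terms" and "directory_docs", an
exception thrown from the second step leaves "file_terms" committed,
and calling upgrade() again only redoes "directory_docs".  As noted
above, this only pays off when a single upgrade spans several
features.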