On Thu, Jan 05, 2012 at 04:04:22PM +0100, Thomas Jost wrote:
> On Wed, 04 Jan 2012 15:49:19 +0000, boyska <piuttosto@logorroici.org> wrote:
> > Hello!
> > I like notmuch a lot, so I'm writing a (conceptually) similar software
> > about addressbook: it will scan all your emails, storing email addresses
> > in a xapian database (you can think of it as little brother database[1]
> > on steroids)
> > The part that I'd like to re-implement is "notmuch new": it seems that
> > in the xapian db there is not only information about each mail, but
> > also the mtime of each directory. My impression is this being "chaotic",
> > but probably I am just missing the point.
> >
> > So, here's the question: how is the db "structured"? is there any
> > documentation to look at?
> >
> > [1] http://www.spinnaker.de/lbdb/
> >
> > -- 
> > boyska
> > GPG: 0x520CE393
>
> There's a description of the DB "schema" in lib/database.cc in the
> notmuch source code. But you may also consider just using libnotmuch
> instead, if that's enough for what you want to do.

Thanks, found it; it's much clearer now. But I still can't understand
why these things aren't just put in a separate file :) Atomic
consistency issues?

> Also: why Xapian? I'm already using something similar I wrote with
> Python, storing everything in a dictionary, using Pickle to save that to
> disk: 162 lines of code and 45 kb of data are enough to store my
> addressbook and have completion in Emacs...

The dictionary approach is fine for managing a "manual" addressbook,
where you store the addresses yourself. But what I want is an
_automatic_ addressbook, like the lbdb one, which just indexes every
email it sees. The grep approach is better from this point of view, but
still not advanced enough for me.

For example, I'd like to store "co-occurrences": if an address appears
in the same mail as another one, the two should be related. Your
address, for instance, should be correlated with the notmuch mailing
list, because you wrote to it (these relations should be 0-weighted
Xapian terms). Also, I want to give more importance to addresses that
are seen frequently, and much less to those that are seen rarely.
Xapian makes these things really easy, so the question is "why not use
it?" ;)
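
To make the co-occurrence and frequency idea a bit more concrete, here
is a minimal sketch in plain Python. It deliberately does not use
Xapian: it only illustrates the bookkeeping with in-memory counters.
The names AddressIndex, add_message, related and complete are
hypothetical, and looking only at the From/To/Cc headers is an
assumption made for the example.

```python
from collections import Counter
from email import message_from_string
from email.utils import getaddresses
from itertools import combinations

# Assumption for this sketch: only these headers carry addresses of interest.
ADDRESS_HEADERS = ("from", "to", "cc")


class AddressIndex:
    """Toy in-memory index: how often each address is seen, and with whom."""

    def __init__(self):
        self.seen = Counter()      # address -> number of mails it appears in
        self.together = Counter()  # (addr_a, addr_b) -> mails shared by both

    def add_message(self, raw_message):
        """Index one raw RFC 2822 message given as a string."""
        msg = message_from_string(raw_message)
        values = []
        for header in ADDRESS_HEADERS:
            values.extend(msg.get_all(header, []))
        # Deduplicate and sort so every pair is always stored in one order.
        addresses = sorted(
            {addr.lower() for _name, addr in getaddresses(values) if addr}
        )
        self.seen.update(addresses)
        self.together.update(combinations(addresses, 2))

    def related(self, address):
        """Addresses that shared a mail with `address`, strongest link first."""
        address = address.lower()
        scores = Counter()
        for (a, b), count in self.together.items():
            if a == address:
                scores[b] += count
            elif b == address:
                scores[a] += count
        return scores.most_common()

    def complete(self, prefix):
        """Addresses starting with `prefix`, most frequently seen first."""
        prefix = prefix.lower()
        matches = [a for a in self.seen if a.startswith(prefix)]
        return sorted(matches, key=lambda a: (-self.seen[a], a))
```

With an index like this, someone who posts to a list ends up related to
the list address, and completion prefers addresses that show up often.
Scanning every pair in related() is fine for a toy, but it is exactly
the kind of lookup and ranking a real search backend would do better.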