Quoth Jason A. Donenfeld on Dec 13 at 3:32 pm:
> On Wed, Dec 12, 2012 at 9:49 PM, Austin Clements <amdragon@mit.edu> wrote:
> > There should be no way to corrupt the database at this level through
> > the Xapian API, which means nothing libnotmuch can do (much less users
> > of libnotmuch) should be able to corrupt the database. If you can
> > reproduce the problem, it's probably a serious bug in Xapian, but it
> > could also have been a file system bug or even random file system
> > corruption.
>
> Well that's... troubling.
>
> Patrick: could you please backup and try to reproduce? Otherwise I'll
> assume this was a one-off situation.
>
>
> Austin-- think you could do a quick review of the script to double
> check and confirm I'm not doing anything nefarious?
> http://git.zx2c4.com/gmail-notmuch/tree/gmail-notmuch.py

In theory the only way you could cause corruption besides tickling a
bug would be to access the same database object concurrently from
different threads (since it's not thread-safe), but you don't appear
to be doing that.

I did spot something that could corrupt delivered email, though. The
way you deliver to the Maildir is resilient to process termination,
but not to system failures such as power outages. In particular, you
need to at least os.fsync before the os.link. I'd recommend looking
at Python's mailbox module, which has a robust Maildir delivery
implementation (though it appears it doesn't let you control the file
name, so you probably can't use it directly).