Re: out of memory on idle machine (was: Re: consistent database corruption with notmuch new)

Date: Sat, 30 Jan 2021 08:58:55 -0400

To: Gregor Zattler, notmuch

Cc:

From: David Bremner


Gregor Zattler <telegraph@gmx.net> writes:

> Hi notmuch developers,
> * Gregor Zattler <telegraph@gmx.net> [14. Dez. 2020]:
>> notmuch new still corrupts the database, the second notmuch new
>> invocation finds emails the first did not find.
>
> I'm still searching for the reason notmuch chokes on my mails.
>
> I assembled a HP MicroServer, installed basic debian buster and
> notmuch from the debian buster repo, rsynced my mail to a
> separate file system symlinked to the same location as on my
> laptop.
>
> There are now
> grfz@mic:~/Mail$ find -type f | wc -l
> 1209419
> files on this file system.  No other process touches this
> file system, actually the machine is otherwise idle.
>
> I did notmuch new several times in a row:
>
> grfz@mic:~/Mail/.notmuch$ rm -rf xapian
> grfz@mic:~/Mail/.notmuch$ notmuch new
> Welcome to a new version of notmuch! Your database will now be upgraded.
> This process is safe to interrupt.
> Backing up tags to /home/grfz/Mail/.notmuch/dump-20210127T114210.gz...
> Your notmuch database has now been upgraded.
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607947606.8134_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607940473.9509_1.no:2,S
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607969276.21046_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607987211.1395_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607979988.4942_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607972847.4857_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607943993.24776_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607976389.23296_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607983586.19063_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/drafts.mbox
> Note: Ignoring non-mail file: /home/grfz/Mail/postponed.mbox
> Processed 1183682 total files in 16h 43m 27s (19 files/sec.).
> Added 1091038 new messages to the database.
> grfz@mic:~/Mail/.notmuch$ notmuch new
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607947606.8134_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607940473.9509_1.no:2,S
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607969276.21046_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607987211.1395_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607979988.4942_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607972847.4857_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607943993.24776_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607976389.23296_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607983586.19063_1.no:2,
> Processed 1169095 total files in 16h 52m 48s (19 files/sec.).
> Added 1077686 new messages to the database.

Idea #1
-------

There are several mysteries here, but maybe we should begin at the
beginning. Something is wrong if notmuch scans your entire mail tree the
second time you run notmuch new.

Notmuch checks the mtime of directories against the time stored in the
database. As a sanity check, maybe you can do that for one of your
directories with many messages. This needs "quest" and "xapian-delve",
from the package xapian-tools.

Unfortunately this should probably be done right after the first notmuch
new. I have another idea to try (below) in the state after several runs
of notmuch new, where you are getting OOM.

I'll use real paths for my system; you'll need to update them.

This gives the directory's mtime in seconds since the epoch:

% stat --format "%Y" ~/Maildir/tethera/cur
1612008734

Now let us find the database document for that directory:

% quest -bdir:XDIRECTORY -d ~/Maildir/.notmuch/xapian/ dir:tethera/cur

Parsed Query: Query(0 * XDIRECTORYtethera/cur)
Exactly 1 matches
MSet:
431067: [0]
tethera/cur

Grabbing the record number from the output of quest:

% xapian-delve -r 431067 -VS0 ~/Maildir/.notmuch/xapian

Value 0 for record #431067: 1.61201e+09
Term List for record #431067: XDDIRENTRY387045:cur XDIRECTORYtethera/cur

You can see the value matches the mtime to the 6 significant digits
that xapian-delve prints.
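
If you want to script that comparison, here is a minimal sketch. It
assumes the displayed value is just the mtime printed with six
significant digits, as in the 1.61201e+09 output above; the helper names
round6 and mtime_matches are made up for illustration. Note that at this
precision the last digit is worth 10000 seconds, so a match is only a
coarse check.

```shell
# Hypothetical helper: print an integer mtime with six significant
# digits, the same shape as the value shown by xapian-delve above.
round6() {
    LC_ALL=C awk -v t="$1" 'BEGIN { printf "%.6g\n", t }'
}

# Hypothetical helper: succeed if the mtime ($1) rounds to the value
# that was displayed for the directory document ($2).
mtime_matches() {
    [ "$(round6 "$1")" = "$2" ]
}
```

For example:

% mtime_matches "$(stat --format "%Y" ~/Maildir/tethera/cur)" 1.61201e+09 && echo consistent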

Idea #2
-------

Try to figure out if some specific file is causing the OOM.

Run notmuch new under gdb.

There is a check for NOTMUCH_STATUS_OUT_OF_MEMORY around lines 419-420
of notmuch-new.c. If I understand correctly, that is where things are
failing. The following is untested; you will need the package
notmuch-dbgsym installed [1].

% gdb --args notmuch new
(gdb) b notmuch-new.c:420
(gdb) run
(gdb) p filename
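
Since a full run takes many hours on your machine, it may be easier to
run the same session unattended and keep the output. This is an equally
untested sketch: the function name and the default log file name are
made up, and it assumes the breakpoint location and the variable
"filename" are as described above. The backtrace is an extra I added.

```shell
# Sketch: same gdb session as above, run in batch mode with the output
# saved to a log file ($1, defaulting to gdb-notmuch-new.log).
notmuch_new_under_gdb() {
    gdb -batch \
        -ex 'break notmuch-new.c:420' \
        -ex 'run' \
        -ex 'print filename' \
        -ex 'backtrace' \
        --args notmuch new 2>&1 | tee "${1:-gdb-notmuch-new.log}"
}
```

If the breakpoint is never hit, the print and backtrace commands will
just report errors once notmuch new exits.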



[1]: https://wiki.debian.org/AutomaticDebugPackages
