On Sun, 02 Jun 2013, Jameson Graef Rollins <jrollins@finestructure.net> wrote: > On Sat, Jun 01 2013, David Bremner <david@tethera.net> wrote: >> Austin Clements <amdragon@MIT.EDU> writes: >> >>> This is v3 of id:1369934016-22308-1-git-send-email-amdragon@mit.edu. >>> This tweaks the shell invocation as suggested by Tomi and fixes two >>> comment typos pointed out by Mark. It also adds a NEWS patch. I'm >>> going to go ahead and mark this ready because of Tomi's and Mark's >>> reviews of v2. >> >> The first 5 I pushed. The NEWS patch has a conflict. > > I'm very happy to see the long-coming sexp handling working here. Good > work, folks, particularly to Austin for getting the awesome asynchronous > processing stuff working. Searches are now definitely noticeably > faster. > > I am, however, seeing a couple of issues that we might want to address. > > * Killing a search buffer that is still in the process of being filled > causes errors to be thrown. I'm seeing both of the following > intermittently: > > [Sun Jun 2 08:26:40 2013] > notmuch exited with status killed > command: notmuch search --format\=sexp --format-version\=1 --sort\=newest-first to\:jrollins > exit signal: killed > > [Sun Jun 2 08:32:26 2013] > notmuch exited with status hangup > command: notmuch search --format\=sexp --format-version\=1 --sort\=newest-first to\:jrollins > exit signal: hangup > > This is somewhat understandable, as the notmuch binary exits with an > error if it hasn't finished dumping the output, but given how common > this particular scenario is I think we should try to avoid throwing > errors in this circumstance. I wonder if we shouldn't just modify the > binary to not return non-zero if it was manually killed while > processing the output, or at least special-case the particular error > caused by manually killing the search. Your assessment is correct, of course. The right place to fix this is in Emacs, not the CLI (the CLI *can't* do anything about this, since it gets killed by a signal). Probably we should do something different in the sentinel if the search process's buffer is no longer live. Clearly we should suppress the status error for the signal, but I think we still should report anything that appeared in err-file because it may be relevant to why the user killed the buffer (e.g., maybe a notmuch wrapper was blocked on something). > * The next thing I'm seeing is this: > > Opening input file: no such file or directory, /home/jrollins/tmp/nmerr5390CAY > > I'm not exactly sure what causes this error, but it looks to me like > the temporary error file was removed before we were finished with it. This one's pretty awesome (and I think is a bug in Emacs). At a high level, the sentinel is getting run twice and since the first call deletes the error file, the second call fails. At a low level, what causes this is fascinating. 1) You kill the search buffer. This invokes kill_buffer_processes, which sends a SIGHUP to notmuch, but doesn't do anything else. Meanwhile, the notmuch search process has printed some more output, but Emacs hasn't consumed it yet (this is critical). 2) Emacs gets a SIGCHLD from the dying notmuch process, which invokes handle_child_signal, which sets the new process status, but can't do anything else because it's a signal handler. 3) Emacs returns to its idle loop, which calls status_notify, which sees that the notmuch process has a new status. This is where things get interesting. 3.1) Emacs guarantees that it will run process filters on any unconsumed output before running the process sentinel, so status_notify calls read_process_output, which consumes the final output and calls notmuch-search-process-filter. 3.1.1) notmuch-search-process-filter contains code to check if the search buffer is still alive and, since it's not, it calls delete-process. 3.1.1.1) delete-process correctly sees that the process is already dead and doesn't try to send another signal, *but* it still modifies the status to "killed". To deal with the new status, it calls status_notify. Dun dun dun. We've seen this function before. 3.1.1.1.1) The *recursive* status_notify invocation sees that the process has a new status and doesn't have any more output to consume, so it invokes our sentinel and returns. 3.2) The outer status_notify call (which we're still in) is now done flushing pending process output, so it *also* invokes our sentinel. It might be that the answer is to just remove the delete-process call from the filter. It seems completely redundant (and racy) with Emacs' automatic SIGHUP'ing. > * Finally, something happened that caused *12,000* of the following lines > to be sent to the *Notmuch errors* buffer: > > A Xapian exception occurred performing query: The revision being read has been discarded - you should call Xapian::Database::reopen() and retry the operation > > Again, this was related to killing a search buffer that was still > being filled. I'm pretty sure the database was not modified during > this process. I have no insight on this one. My best guess is that this has nothing to do with this change except that this change makes these warnings visible rather than burying them somewhere down in the search results buffer. > Let me know if I can help provide any more info. > > jamie.