Re: [PATCH 08/11] emacs: Support caching in notmuch-get-bodypart-{binary, text}

Subject: Re: [PATCH 08/11] emacs: Support caching in notmuch-get-bodypart-{binary, text}

Date: Sat, 24 Jan 2015 13:06:59 -0500

To: Mark Walters

Cc: notmuch@notmuchmail.org

From: Austin Clements


On Fri, 25 Apr 2014, Mark Walters <markwalters1009@gmail.com> wrote:
> On Thu, 24 Apr 2014, Austin Clements <amdragon@MIT.EDU> wrote:
>> Quoth Mark Walters on Apr 24 at 11:46 am:
>>> 
>>> On Mon, 21 Apr 2014, Austin Clements <amdragon@MIT.EDU> wrote:
>>> > (The actual code change here is small, but requires re-indenting
>>> > existing code.)
>>> > ---
>>> >  emacs/notmuch-lib.el | 52 ++++++++++++++++++++++++++++++----------------------
>>> >  1 file changed, 30 insertions(+), 22 deletions(-)
>>> >
>>> > diff --git a/emacs/notmuch-lib.el b/emacs/notmuch-lib.el
>>> > index fc67b14..fee8512 100644
>>> > --- a/emacs/notmuch-lib.el
>>> > +++ b/emacs/notmuch-lib.el
>>> > @@ -503,33 +503,39 @@ (defun notmuch-parts-filter-by-type (parts type)
>>> >     (lambda (part) (notmuch-match-content-type (plist-get part :content-type) type))
>>> >     parts))
>>> >  
>>> > -(defun notmuch-get-bodypart-binary (msg part process-crypto)
>>> > +(defun notmuch-get-bodypart-binary (msg part process-crypto &optional cache)
>>> >    "Return the unprocessed content of PART in MSG as a unibyte string.
>>> >  
>>> >  This returns the \"raw\" content of the given part after content
>>> >  transfer decoding, but with no further processing (see the
>>> >  discussion of --format=raw in man notmuch-show).  In particular,
>>> >  this does no charset conversion."
>>> > -  (let ((args `("show" "--format=raw"
>>> > -		,(format "--part=%d" (plist-get part :id))
>>> > -		,@(when process-crypto '("--decrypt"))
>>> > -		,(notmuch-id-to-query (plist-get msg :id)))))
>>> > -    (with-temp-buffer
>>> > -      ;; Emacs internally uses a UTF-8-like multibyte string
>>> > -      ;; representation by default (regardless of the coding system,
>>> > -      ;; which only affects how it goes from outside data to this
>>> > -      ;; internal representation).  This *almost* never matters.
>>> > -      ;; Annoyingly, it does matter if we use this data in an image
>>> > -      ;; descriptor, since Emacs will use its internal data buffer
>>> > -      ;; directly and this multibyte representation corrupts binary
>>> > -      ;; image formats.  Since the caller is asking for binary data, a
>>> > -      ;; unibyte string is a more appropriate representation anyway.
>>> > -      (set-buffer-multibyte nil)
>>> > -      (let ((coding-system-for-read 'no-conversion))
>>> > -	(apply #'call-process notmuch-command nil '(t nil) nil args)
>>> > -	(buffer-string)))))
>>> > -
>>> > -(defun notmuch-get-bodypart-text (msg part process-crypto)
>>> > +  (let ((data (plist-get part :binary-content)))
>>> > +    (when (not data)
>>> > +      (let ((args `("show" "--format=raw"
>>> > +		    ,(format "--part=%d" (plist-get part :id))
>>> > +		    ,@(when process-crypto '("--decrypt"))
>>> > +		    ,(notmuch-id-to-query (plist-get msg :id)))))
>>> > +	(with-temp-buffer
>>> > +	  ;; Emacs internally uses a UTF-8-like multibyte string
>>> > +	  ;; representation by default (regardless of the coding
>>> > +	  ;; system, which only affects how it goes from outside data
>>> > +	  ;; to this internal representation).  This *almost* never
>>> > +	  ;; matters.  Annoyingly, it does matter if we use this data
>>> > +	  ;; in an image descriptor, since Emacs will use its internal
>>> > +	  ;; data buffer directly and this multibyte representation
>>> > +	  ;; corrupts binary image formats.  Since the caller is
>>> > +	  ;; asking for binary data, a unibyte string is a more
>>> > +	  ;; appropriate representation anyway.
>>> > +	  (set-buffer-multibyte nil)
>>> > +	  (let ((coding-system-for-read 'no-conversion))
>>> > +	    (apply #'call-process notmuch-command nil '(t nil) nil args)
>>> > +	    (setq data (buffer-string)))))
>>> > +      (when cache
>>> > +	(plist-put part :binary-content data)))
>>> > +    data))
>>> 
>>> I am a little puzzled by this but that could be lack of familiarity with
>>> elisp. As far as I can see plist-put will sometimes modify the original
>>> plist and sometimes return a new plist. If the latter happens then I
>>> think it works out as if we hadn't cached anything as the part passed to
>>> the function is unmodified. That might not matter in this case (though I
>>> find the lack of determinism disturbing).
>>> 
>>> Or is something else going on?
>>
>> No, your familiarity with Elisp serves you well.  I'm completely
>> cheating here.  According to the specification of plist-put, it's
>> allowed to return a new list but in reality this only happens when the
>> original plist is nil.  We lean on this already all over the
>> notmuch-emacs code, but maybe that doesn't excuse me adding one more
>> cheat.
>>
>> I could add a comment here explaining what's going on, I could
>> manually do the list insertion in a way that's guaranteed to mutate it
>> in place, or I could add a nil :binary-content property when parts are
>> created (since plist-put is guaranteed to mutate existing keys in
>> place).
>
> I think a comment is fine. 

Done.

> (Incidentally what is the best way of telling if emacs has changed an
> object or returned a new one for other commands? Something like (setq
> oldobject object) (setq object (operation-on object)) (if (eq object
> oldobject) ... ))

If `eq' returns t, it definitely returned the original object, though it
may or may not have modified it.  If `eq' returns nil, there's no
guarantee that it *didn't* modify the object you passed in (maybe it
modified it and tacked a new cons on the beginning).  In short, there's
really no way of knowing.

> Also, I think the function should have a comment about the lifetime of
> the caching. I think in some cases the addition of :binary-content could
> occur on load and thus the plist with binary content added would get
> saved in the buffer when the msg plist was saved as a
> text-property. However, maybe in other cases this gets called after the
> initial insertion and thus the cached value is just used during this
> operation on msg? Sorry that is a little incoherent as I haven't checked
> all callers.

I believe the caching will always last as long as the buffer.  We only
have one instance of each message plist.  This plist goes in to the text
properties and caching works by mutating this same plist.

However, I updated the docstring to clarify that the cache is stored in
MSG.  These functions don't ultimately control the lifetime of MSG, but
this gives the caller enough information to know the lifetime of the
cache if it knows the lifetime of MSG.

> Best wishes
>
> Mark
>
>
>
>>> Best wishes
>>> 
>>> Mark
>>> 
>>> 
>>> 
>>> > +
>>> > +(defun notmuch-get-bodypart-text (msg part process-crypto &optional cache)
>>> >    "Return the text content of PART in MSG.
>>> >  
>>> >  This returns the content of the given part as a multibyte Lisp
>>> > @@ -546,7 +552,9 @@ (defun notmuch-get-bodypart-text (msg part process-crypto)
>>> >  	     (npart (apply #'notmuch-call-notmuch-sexp args)))
>>> >  	(setq content (plist-get npart :content))
>>> >  	(when (not content)
>>> > -	  (error "Internal error: No :content from %S" args))))
>>> > +	  (error "Internal error: No :content from %S" args)))
>>> > +      (when cache
>>> > +	(plist-put part :content content)))
>>> >      content))
>>> >  
>>> >  ;; Workaround: The call to `mm-display-part' below triggers a bug in

Thread: