Re: [PATCH 1/2] python/notmuch2: do not destroy messages owned by a query

Subject:Re: [PATCH 1/2] python/notmuch2: do not destroy messages owned by a query

Date:Fri, 22 May 2020 00:17:38 +0200

To:Anton Khirnov ,notmuch@notmuchmail.org

Cc:

From:Floris Bruynooghe


Hi Anton,

Thanks for improving the bindings!  Any my apologies for the late
response, I failed to spot this mail the first time round.

Also, this is a pretty serious bug, thanks for finding it.

This looks pretty solid, a few small style comments that aren't very
important notwithstanding.  Though I do think you should be able to add
a test for this in the pytest tests.  I think just triggering the
use-after-free you demonstrate in the commit message is fine as it will
work with your patch.  The fact it will crash rather then fail without
is unfortunate but probably fine.

On Sat 09 May 2020 at 07:05 +0200, Anton Khirnov wrote:

> Any messages retrieved from a query - either directly via
> search_messages() or indirectly via thread objects - are owned by that
> query. Retrieving the same message (i.e. corresponding to the same
> message ID / database object) several times will always yield the same
> C object.
>
> The caller is allowed to destroy message objects owned by a query before
> the query itself - which can save memory for long-lived queries.
> However, that message must then never be retrieved again from that
> query.
>
> The python-notmuch2 bindings will currently destroy every message object
> in Message._destroy(), which will lead to an invalid free if the same
> message is then retrieved again. E.g. the following python program leads
> to libtalloc abort()ing:
>
> import notmuch2
> db   = notmuch2.Database(mode = notmuch2.Database.MODE.READ_ONLY)
> t    = next(db.threads('*'))
> msgs = list(zip(t.toplevel(), t.toplevel()))
> msgs = list(zip(t.toplevel(), t.toplevel()))
>
> Fix this issue by creating a subclass of Message, which is used for
> "standalone" message which have to be freed by the caller. Message class
> is then used only for messages descended from a query, which do not need
> to be freed by the caller.
> ---
>  bindings/python-cffi/notmuch2/_database.py |  6 ++--
>  bindings/python-cffi/notmuch2/_message.py  | 42 ++++++++++++++--------
>  2 files changed, 30 insertions(+), 18 deletions(-)
>
> diff --git a/bindings/python-cffi/notmuch2/_database.py b/bindings/python-cffi/notmuch2/_database.py
> index 95f59ca0..f14eac78 100644
> --- a/bindings/python-cffi/notmuch2/_database.py
> +++ b/bindings/python-cffi/notmuch2/_database.py
> @@ -399,7 +399,7 @@ class Database(base.NotmuchObject):
>                capi.lib.NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID]
>          if ret not in ok:
>              raise errors.NotmuchError(ret)
> -        msg = message.Message(self, msg_pp[0], db=self)
> +        msg = message.StandaloneMessage(self, msg_pp[0], db=self)
>          if sync_flags:
>              msg.tags.from_maildir_flags()
>          return self.AddedMessage(
> @@ -468,7 +468,7 @@ class Database(base.NotmuchObject):
>          msg_p = msg_pp[0]
>          if msg_p == capi.ffi.NULL:
>              raise LookupError
> -        msg = message.Message(self, msg_p, db=self)
> +        msg = message.StandaloneMessage(self, msg_p, db=self)
>          return msg
>  
>      def get(self, filename):
> @@ -501,7 +501,7 @@ class Database(base.NotmuchObject):
>          msg_p = msg_pp[0]
>          if msg_p == capi.ffi.NULL:
>              raise LookupError
> -        msg = message.Message(self, msg_p, db=self)
> +        msg = message.StandaloneMessage(self, msg_p, db=self)
>          return msg
>  
>      @property
> diff --git a/bindings/python-cffi/notmuch2/_message.py b/bindings/python-cffi/notmuch2/_message.py
> index c5fdbf6d..416ce7ca 100644
> --- a/bindings/python-cffi/notmuch2/_message.py
> +++ b/bindings/python-cffi/notmuch2/_message.py
> @@ -14,7 +14,7 @@ __all__ = ['Message']
>  
>  
>  class Message(base.NotmuchObject):
> -    """An email message stored in the notmuch database.
> +    """An email message stored in the notmuch database retrieved via a query.
>  
>      This should not be directly created, instead it will be returned
>      by calling methods on :class:`Database`.  A message keeps a
> @@ -61,22 +61,10 @@ class Message(base.NotmuchObject):
>  
>      @property
>      def alive(self):
> -        if not self._parent.alive:
> -            return False
> -        try:
> -            self._msg_p
> -        except errors.ObjectDestroyedError:
> -            return False
> -        else:
> -            return True
> -
> -    def __del__(self):
> -        self._destroy()
> +        return self._parent.alive
>  
>      def _destroy(self):
> -        if self.alive:
> -            capi.lib.notmuch_message_destroy(self._msg_p)
> -        self._msg_p = None
> +        pass

I feel like some more description from the commit message could live
here to explain why this isn't destroying the message rather than having
to find it in the commit message.

>  
>      @property
>      def messageid(self):
> @@ -375,6 +363,30 @@ class Message(base.NotmuchObject):
>          if isinstance(other, self.__class__):
>              return self.messageid == other.messageid
>  
> +class StandaloneMessage(Message):

Minor style thing, between top-level code one usually leaves 2 blank
lines per PEP8.  Nowadays perhaps we should think about adopting an
automatic formatter but hey.

(This applies to lots of places in both patches btw)

> +    """An email message stored in the notmuch database.
> +
> +    This subclass of Message is used for messages that are retrieved from the
> +    database directly and are not owned by a query.
> +    """

Likewise I'd be tempted to add here that this message is destroyed as
soon as it is garbage collected pointing out that this is not the case
for Message.  And probably update the Message docstring with the inverse.

> +    @property
> +    def alive(self):
> +        if not self._parent.alive:
> +            return False
> +        try:
> +            self._msg_p
> +        except errors.ObjectDestroyedError:
> +            return False
> +        else:
> +            return True
> +
> +    def __del__(self):
> +        self._destroy()
> +
> +    def _destroy(self):
> +        if self.alive:
> +            capi.lib.notmuch_message_destroy(self._msg_p)
> +        self._msg_p = None
>  
>  class FilenamesIter(base.NotmuchIter):
>      """Iterator for binary filenames objects."""
_______________________________________________
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch

Thread: