Re: [PATCH 3/7] go: Allow notmuch objects to be garbage collected

Subject: Re: [PATCH 3/7] go: Allow notmuch objects to be garbage collected

Date: Thu, 19 Jul 2012 23:23:52 -0400

To: Adrien Bustany

Cc: notmuch@notmuchmail.org

From: Austin Clements


Quoth Adrien Bustany on Jul 19 at  9:25 pm:
> Le 18/07/2012 23:40, Austin Clements a écrit :
> >This is subtle enough that I think it deserves a comment in the source
> >code explaining that tracking the talloc owner reference, combined
> >with the fact that Go finalizers are run in dependency order, ensures
> >that the C objects will always be destroyed from the talloc leaves up.
> 
> Definitely, I tend to comment in the commit message and forget about
> the code...
> 
> >
> >Just one inline comment below.  Otherwise, I think this is all
> >correct.
> 
> Agree with the comment, the Database should be the parent. I guess I
> wasn't sure of the talloc parenting.
> 
> >
> >Is reproducing the talloc hierarchy in all of the bindings really the
> >right approach?  I feel like there has to be a better way (or that the
> >way we use talloc in the library is slightly broken).  What if the
> >bindings created an additional talloc reference to each managed
> >object, just to keep the object alive, and used talloc_unlink instead
> >of the destroy functions?
> 
> Reproducing the hierarchy is probably error prone, and not that
> simple indeed :/
> I haven't checked at all the way you suggest, but if we use
> talloc_reference/unlink, we get the same issue no?
> - If we do for each new wrapped object talloc_reference(NULL,
> wrapped_object), the the object will be kept alive until we
> talloc_unlink(NULL, wrapped_object), but what about its parents? For
> example will doing that on a notmuch_message_t keep the
> notmuch_messages_t alive?

Hmm.  This is what I was thinking.  You have an interesting point; I
think it's slightly wrong, but it exposes something deeper.  I believe
there are two different things going on here: some of the talloc
relationships are for convenience, while some are structural.  In the
former case, I'm pretty sure my suggestion will work, but in the
latter case the objects should *never* be freed by the finalizer!

For example, notmuch_query_search_messages returns a new
notmuch_messages_t with the query as the talloc parent, but that
notmuch_messages_t doesn't depend on the query object; this is just so
you can conveniently delete everything retrieved from the query by
deleting the query.  In this case, you can either use parent
references like you did---which will prevent a double-free by forcing
destruction to happen from the leaves up but at the cost of having to
encode these relationships and of extending the parent object
lifetimes beyond what's strictly necessary---or you can use my
suggestion of creating an additional talloc reference.

However, in your example, the notmuch_message_t's are structurally
related to the notmuch_messages_t from whence they came.  They're all
part of one data structure and hence it *never* makes sense for a
caller to delete the notmuch_message_t's.  For example, even with the
code in this patch, I think the following could lead to a crash:

1. Obtain a Messages object, say ms.
2. m1 := ms.Get()
3. m1 = nil
4. m2 := ms.Get()
5. m2.whatever()

If a garbage collection happens between steps 3 and 4, the Message in
m1 will get finalized and destroyed.  But step 4 will return the same,
now dangling, pointer, leading to a potential crash in step 5.

Maybe the answer in the structural case is to include the parent
pointer in the Go struct and not set a finalizer on the child?  That
way, if there's a Go reference to the parent wrapper, it won't go away
and the children won't get destroyed (collecting wrappers of children
is fine) and if there's a Go reference to the child wrapper, it will
keep the parent alive so it won't get destroyed and neither will the
child.

> - If we do talloc_reference(parent, wrapped), then we reproduce the
> hierarchy again?
> 
> Note that I have 0 experience with talloc, so I might as well be
> getting things wrong here.
> 
> >
> >Quoth Adrien Bustany on Jul 18 at  9:34 pm:
> >>This makes notmuch appropriately free the underlying notmuch C objects
> >>when garbage collecting their Go wrappers. To make sure we don't break
> >>the underlying links between objects (for example, a notmuch_messages_t
> >>being GC'ed before a notmuch_message_t belonging to it), we add for each
> >>wraper struct a pointer to the owner object (Go objects with a reference
> >>pointing to them don't get garbage collected).
> >>---
> >>  bindings/go/src/notmuch/notmuch.go |  153 +++++++++++++++++++++++++++++++-----
> >>  1 files changed, 134 insertions(+), 19 deletions(-)
> >>
> >>diff --git a/bindings/go/src/notmuch/notmuch.go b/bindings/go/src/notmuch/notmuch.go
> >>index 1d77fd2..3f436a0 100644
> >>--- a/bindings/go/src/notmuch/notmuch.go
> >>+++ b/bindings/go/src/notmuch/notmuch.go
> >>@@ -11,6 +11,7 @@ package notmuch
> >>  #include "notmuch.h"
> >>  */
> >>  import "C"
> >>+import "runtime"
> >>  import "unsafe"
> >>
> >>  // Status codes used for the return values of most functions
> >>@@ -47,40 +48,152 @@ func (self Status) String() string {
> >>  /* Various opaque data types. For each notmuch_<foo>_t see the various
> >>   * notmuch_<foo> functions below. */
> >>
> >>+type Object interface {}
> >>+
> >>  type Database struct {
> >>  	db *C.notmuch_database_t
> >>  }
> >>
> >>+func createDatabase(db *C.notmuch_database_t) *Database {
> >>+	self := &Database{db: db}
> >>+
> >>+	runtime.SetFinalizer(self, func(x *Database) {
> >>+		if (x.db != nil) {
> >>+			C.notmuch_database_destroy(x.db)
> >>+		}
> >>+	})
> >>+
> >>+	return self
> >>+}
> >>+
> >>  type Query struct {
> >>  	query *C.notmuch_query_t
> >>+	owner Object
> >>+}
> >>+
> >>+func createQuery(query *C.notmuch_query_t, owner Object) *Query {
> >>+	self := &Query{query: query, owner: owner}
> >>+
> >>+	runtime.SetFinalizer(self, func(x *Query) {
> >>+		if (x.query != nil) {
> >>+			C.notmuch_query_destroy(x.query)
> >>+		}
> >>+	})
> >>+
> >>+	return self
> >>  }
> >>
> >>  type Threads struct {
> >>  	threads *C.notmuch_threads_t
> >>+	owner Object
> >>+}
> >>+
> >>+func createThreads(threads *C.notmuch_threads_t, owner Object) *Threads {
> >>+	self := &Threads{threads: threads, owner: owner}
> >>+
> >>+	runtime.SetFinalizer(self, func(x *Threads) {
> >>+		if (x.threads != nil) {
> >>+			C.notmuch_threads_destroy(x.threads)
> >>+		}
> >>+	})
> >>+
> >>+	return self
> >>  }
> >>
> >>  type Thread struct {
> >>  	thread *C.notmuch_thread_t
> >>+	owner Object
> >>+}
> >>+
> >>+func createThread(thread *C.notmuch_thread_t, owner Object) *Thread {
> >>+	self := &Thread{thread: thread, owner: owner}
> >>+
> >>+	runtime.SetFinalizer(self, func(x *Thread) {
> >>+		if (x.thread != nil) {
> >>+			C.notmuch_thread_destroy(x.thread)
> >>+		}
> >>+	})
> >>+
> >>+	return self
> >>  }
> >>
> >>  type Messages struct {
> >>  	messages *C.notmuch_messages_t
> >>+	owner Object
> >>+}
> >>+
> >>+func createMessages(messages *C.notmuch_messages_t, owner Object) *Messages {
> >>+	self := &Messages{messages: messages, owner: owner}
> >>+
> >>+	return self
> >>  }
> >>
> >>  type Message struct {
> >>  	message *C.notmuch_message_t
> >>+	owner Object
> >>+}
> >>+
> >>+func createMessage(message *C.notmuch_message_t, owner Object) *Message {
> >>+	self := &Message{message: message, owner: owner}
> >>+
> >>+	runtime.SetFinalizer(self, func(x *Message) {
> >>+		if (x.message != nil) {
> >>+			C.notmuch_message_destroy(x.message)
> >>+		}
> >>+	})
> >>+
> >>+	return self
> >>  }
> >>
> >>  type Tags struct {
> >>  	tags *C.notmuch_tags_t
> >>+	owner Object
> >>+}
> >>+
> >>+func createTags(tags *C.notmuch_tags_t, owner Object) *Tags {
> >>+	self := &Tags{tags: tags, owner: owner}
> >>+
> >>+	runtime.SetFinalizer(self, func(x *Tags) {
> >>+		if (x.tags != nil) {
> >>+			C.notmuch_tags_destroy(x.tags)
> >>+		}
> >>+	})
> >>+
> >>+	return self
> >>  }
> >>
> >>  type Directory struct {
> >>  	dir *C.notmuch_directory_t
> >>+	owner Object
> >>+}
> >>+
> >>+func createDirectory(directory *C.notmuch_directory_t, owner Object) *Directory {
> >>+	self := &Directory{dir: directory, owner: owner}
> >>+
> >>+	runtime.SetFinalizer(self, func(x *Directory) {
> >>+		if (x.dir != nil) {
> >>+			C.notmuch_directory_destroy(x.dir)
> >>+		}
> >>+	})
> >>+
> >>+	return self
> >>  }
> >>
> >>  type Filenames struct {
> >>  	fnames *C.notmuch_filenames_t
> >>+	owner Object
> >>+}
> >>+
> >>+func createFilenames(filenames *C.notmuch_filenames_t, owner Object) *Filenames {
> >>+	self := &Filenames{fnames: filenames, owner: owner}
> >>+
> >>+	runtime.SetFinalizer(self, func(x *Filenames) {
> >>+		if (x.fnames != nil) {
> >>+			C.notmuch_filenames_destroy(x.fnames)
> >>+		}
> >>+	})
> >>+
> >>+	return self
> >>  }
> >>
> >>  type DatabaseMode C.notmuch_database_mode_t
> >>@@ -100,12 +213,13 @@ func NewDatabase(path string) (*Database, Status) {
> >>  		return nil, STATUS_OUT_OF_MEMORY
> >>  	}
> >>
> >>-	self := &Database{db: nil}
> >>-	st := Status(C.notmuch_database_create(c_path, &self.db))
> >>+	var db *C.notmuch_database_t;
> >>+	st := Status(C.notmuch_database_create(c_path, &db))
> >>  	if st != STATUS_SUCCESS {
> >>  		return nil, st
> >>  	}
> >>-	return self, st
> >>+
> >>+	return createDatabase(db), st
> >>  }
> >>
> >>  /* Open an existing notmuch database located at 'path'.
> >>@@ -134,12 +248,13 @@ func OpenDatabase(path string, mode DatabaseMode) (*Database, Status) {
> >>  		return nil, STATUS_OUT_OF_MEMORY
> >>  	}
> >>
> >>-	self := &Database{db: nil}
> >>-	st := Status(C.notmuch_database_open(c_path, C.notmuch_database_mode_t(mode), &self.db))
> >>+	var db *C.notmuch_database_t;
> >>+	st := Status(C.notmuch_database_open(c_path, C.notmuch_database_mode_t(mode), &db))
> >>  	if st != STATUS_SUCCESS {
> >>  		return nil, st
> >>  	}
> >>-	return self, st
> >>+
> >>+	return createDatabase(db), st
> >>  }
> >>
> >>  /* Close the given notmuch database, freeing all associated
> >>@@ -204,7 +319,7 @@ func (self *Database) GetDirectory(path string) (*Directory, Status) {
> >>  	if st != STATUS_SUCCESS || c_dir == nil {
> >>  		return nil, st
> >>  	}
> >>-	return &Directory{dir: c_dir}, st
> >>+	return createDirectory(c_dir, nil), st
> >
> >It looks like you have a nil owner for anything whose talloc parent is
> >the database.  Is this intentional?  Shouldn't the owner be self in
> >these cases, too?
> >
> >>  }
> >>
> >>  /* Add a new message to the given notmuch database.
> >>@@ -258,7 +373,7 @@ func (self *Database) AddMessage(fname string) (*Message, Status) {
> >>  	var c_msg *C.notmuch_message_t = new(C.notmuch_message_t)
> >>  	st := Status(C.notmuch_database_add_message(self.db, c_fname, &c_msg))
> >>
> >>-	return &Message{message: c_msg}, st
> >>+	return createMessage(c_msg, nil), st
> >>  }
> >>
> >>  /* Remove a message from the given notmuch database.
> >>@@ -319,12 +434,12 @@ func (self *Database) FindMessage(message_id string) (*Message, Status) {
> >>  		return nil, STATUS_OUT_OF_MEMORY
> >>  	}
> >>
> >>-	msg := &Message{message: nil}
> >>-	st := Status(C.notmuch_database_find_message(self.db, c_msg_id, &msg.message))
> >>+	var msg *C.notmuch_message_t
> >>+	st := Status(C.notmuch_database_find_message(self.db, c_msg_id, &msg))
> >>  	if st != STATUS_SUCCESS {
> >>  		return nil, st
> >>  	}
> >>-	return msg, st
> >>+	return createMessage(msg, nil), st
> >>  }
> >>
> >>  /* Return a list of all tags found in the database.
> >>@@ -339,7 +454,7 @@ func (self *Database) GetAllTags() *Tags {
> >>  	if tags == nil {
> >>  		return nil
> >>  	}
> >>-	return &Tags{tags: tags}
> >>+	return createTags(tags, nil)
> >>  }
> >>
> >>  /* Create a new query for 'database'.
> >>@@ -379,7 +494,7 @@ func (self *Database) CreateQuery(query string) *Query {
> >>  	if q == nil {
> >>  		return nil
> >>  	}
> >>-	return &Query{query: q}
> >>+	return createQuery(q, nil)
> >>  }
> >>
> >>  /* Sort values for notmuch_query_set_sort */
> >>@@ -459,7 +574,7 @@ func (self *Query) SearchThreads() *Threads {
> >>  	if threads == nil {
> >>  		return nil
> >>  	}
> >>-	return &Threads{threads: threads}
> >>+	return createThreads(threads, self)
> >>  }
> >>
> >>  /* Execute a query for messages, returning a notmuch_messages_t object
> >>@@ -505,7 +620,7 @@ func (self *Query) SearchMessages() *Messages {
> >>  	if msgs == nil {
> >>  		return nil
> >>  	}
> >>-	return &Messages{messages: msgs}
> >>+	return createMessages(msgs, self)
> >>  }
> >>
> >>  /* Destroy a notmuch_query_t along with any associated resources.
> >>@@ -607,7 +722,7 @@ func (self *Messages) Get() *Message {
> >>  	if msg == nil {
> >>  		return nil
> >>  	}
> >>-	return &Message{message: msg}
> >>+	return createMessage(msg, self)
> >>  }
> >>
> >>  /* Move the 'messages' iterator to the next message.
> >>@@ -659,7 +774,7 @@ func (self *Messages) CollectTags() *Tags {
> >>  	if tags == nil {
> >>  		return nil
> >>  	}
> >>-	return &Tags{tags: tags}
> >>+	return createTags(tags, self)
> >>  }
> >>
> >>  /* Get the message ID of 'message'.
> >>@@ -739,7 +854,7 @@ func (self *Message) GetReplies() *Messages {
> >>  	if msgs == nil {
> >>  		return nil
> >>  	}
> >>-	return &Messages{messages: msgs}
> >>+	return createMessages(msgs, self)
> >>  }
> >>
> >>  /* Get a filename for the email corresponding to 'message'.
> >>@@ -871,7 +986,7 @@ func (self *Message) GetTags() *Tags {
> >>  	if tags == nil {
> >>  		return nil
> >>  	}
> >>-	return &Tags{tags: tags}
> >>+	return createTags(tags, self)
> >>  }
> >>
> >>  /* The longest possible tag value. */

Thread: