Excerpts from martin f krafft's message of Wed Feb 17 22:46:13 -0500 2010: > You ought to have sent to the list, and I want to send mine there > too, so please give permission. > Oops! Sorry about that. Damn you sup. > also sprach Ben Gamari <bgamari@gmail.com> [2010.02.18.1620 +1300]: > > This is a very good point. From what I've read about the database > > format, I can't think of any way that reverse dependencies could be > > easily found, unfortunately. If there really is no way to do this, then > > we could have a problem. I'm not sure rewriting tens of megabytes > > everytime you receive a mail message is acceptable. > > You would not need to do that, since the messages don't change, and > thus their blobs remain the same. I believe you would. The problem isn't the messages (well, that's a problem too), it's the fact that the tree (e.g. tab) objects which reference the messages are immutable (I believe). This presents us with the difficult circumstance of being unable to modify a tag after it has been created. Therefore, as far as I can tell, we need to rewrite the tag's tree object whenever we add or remove a message. This was the reason I suggested nesting tag trees, although this only partially solves the issue. (Please correct me if I'm wrong about any/all of the above) > > However, for every manipulation of a message, you would need to > iterate *all* tag trees (O(n)) and update the ones referencing the > message (also O(n)). > This is definitely an issue. > This can probably be further optimised, but still: it's not quite as > nice as enumerating all parents of a message in O(1) time (which > would still result in O(m×n)). > Yeah, I'm not sure how well this would scale on truly massive mail stores. Cheers, - Ben