I profiled it, but nothing jumped out at me. Here's the code I used:

    import notmuch2
    import timeit

    def msg2_threads():
        # Query threads directly, then walk every message in each thread.
        db = notmuch2.Database()
        ts = db.threads("date:2025")
        for t in ts:
            # Dicts are used as ordered sets of authors and tags per thread.
            authors = {}
            tags = {}
            for msg in t:
                authors[msg.header("from")] = 1
                for tag in msg.tags:
                    tags[tag] = 1
            author_list = list(authors.keys())
            tag_list = list(tags.keys())
        db.close()

    def msg2_messages():
        # Query matching messages, collect their thread IDs, then fetch
        # all messages of those threads with a single "thread:... or ..." query.
        db = notmuch2.Database()
        ts = db.messages("date:2025")
        threads = {}
        for msg in ts:
            threads[msg.threadid] = 1
        authors = {}
        tags = {}
        for msg in db.messages(" or ".join([f"thread:{t}" for t in list(threads.keys())])):
            authors[msg.header("from")] = 1
            for tag in msg.tags:
                tags[tag] = 1
        author_list = list(authors.keys())
        tag_list = list(tags.keys())
        db.close()

    print(timeit.timeit(msg2_threads, number=10))
    print(timeit.timeit(msg2_messages, number=10))

The second function takes *half* the time of the first on my machine, even
though they both get the same messages.

Cheers,

Lars

On Sun, 09 Feb 2025 14:56:42 -0400, David Bremner <david@tethera.net> wrote:
> Lars Kotthoff <lists@larsko.org> writes:
>
> > On a somewhat related note, I've noticed that getting threads is much
> > slower than getting messages that match the same query, extracting the thread
> > IDs, and then getting the messages for each of those threads. This seems to be
> > the case both in the old and new APIs. Any ideas?
> >
>
> Retrieving threads with C-API does a fair amount of work, so it might be
> worth running under perf and seeing if there is a common hotspot in the
> notmuch library.
>
> d
>
[...]