Re: [PATCH 1/1] python/notmuch2: provide binding for database_get_directory()

Subject: Re: [PATCH 1/1] python/notmuch2: provide binding for database_get_directory()

Date: Sat, 31 Jul 2021 11:20:09 +0200

To: Ludovic LANGE, notmuch@notmuchmail.org

Cc:

From: Floris Bruynooghe


On Sat 31 Jul 2021 at 00:33 +0200, Ludovic LANGE wrote:
> (Btw, it seems your mail was sent directly to me, not the M/L, was it
> intentional or an accident ? Both are OK for me.)

No, that was accidental.  Apologies!

> While you review the new version of the patch, a few thoughts on your
> answers below:
>
> Le 27/07/2021 à 23:01, Floris Bruynooghe a écrit :
>>>>> +        :raises UpgradeRequiredError: The database must be upgraded
>>>>> +           first.
>>>>> +        """
>>>>> +        if not hasattr(os, 'PathLike') and isinstance(path, pathlib.Path):
>>>>> +            path = bytes(path)
>>>> I think `path = os.fspath(path)` should handle this fine.  But I don't
>>>> think you even need to do that as `os.fsencode()` will just do the right
>>>> thing already.
>>> This is a construct I saw multiple times in the same file (_database.py)
>>> of the notmuch2 python bindings. Not familiar with it, I guessed I
>>> should do the same also.
>>>
>>> I'm ok to remove it, but for consistency a more general approach should
>>> be taken for the rest of the ~6 times it occurs in this file. I'd be
>>> happy to have more insights on why you (as you seem to be the author of
>>> the new bindings, and of these constructs) had to put them here the
>>> first time ?
>> Heh, I guess I just didn't write that good code and read the docs better
>> this time round?  Pathlib was fairly new back then.  I think the other 6
>> times should probably change because it seems like these are the python
>> APIs to do this with and are less error prone.
>
> OK, that's fine - it's just that I don't have much experience with this
> PathLib API.
>
> I'm OK to remove this construct, no issue, will do it. I'll let you do
> the others or I can do it, as you wish.
>
>
>>>>> +class PurePathIter(FilenamesIter):
>>>>> +    """Iterator for pathlib.PurePath objects."""
>>>>> +
>>>>> +    def __next__(self):
>>>>> +        fname = super().__next__()
>>>>> +        return pathlib.PurePath(os.fsdecode(fname))
>>>> I'm surprised you explicitly need a PurePath here?
>>> Not needed. My reasoning was : a Directory is returning real-paths
>>> (directories, files) on the filesystem, but the API is only directed
>>> towards manipulating the notmuch database, not the underlying filesystem
>>> (otherwise we would have to re-run notmuch to synch the differences). So
>>> by returning a PurePath (as opposed to a concrete Path) it could be a
>>> signal to the user not to mess with the undeerlying filesystem.
>> I like this reasoning :)
>>
>> I wonder if we can go even further and not supply the `.files()` and
>> `.directories()` functions at all.  After all there is `Directory.path`
>> and in python you can just do `[p for p in Direcotry.path.iterdir() if
>> p.is_file()]` o get the equivalent.  OTOH supplying these allows you to
>> verify that notmuch sees the same things.
>>
>> So maybe it makes sense to keep this as PurePath and keep .files() and
>> .directories() but explain in their docstring they are more advanced and
>> point users to the list comprehension instead for most uses.
>>
>> Maybe someone who knows the C API better can comment on whether this
>> seems reasonable or whether I have this completely wrong.
>
> I'm a little embarrassed removing those functions (`.files()` and
> `.directories()`):
>
> * They're part of the C API, and in my vision the python API should be a
> (maybe agnostic) wrapper around the C API, not removing functions. I'm
> OK to be a little more "Pythonic", but I'd prefer not to diverge from
> the C roots.

This I'm not too swayed by, if you want a faithful API then you should
use the notmuch._capi module directly but you have to deal with all the
hurdles.  So I think it's reasonable to interpret things in a suitable
way.

> * In this specific use-case, asking the Database its vision of the
> filesystem is what I want - because I'm dealing with mails (that are in
> some folders) indexed by the database. If there is an inconsistency
> (folders not indexed for whatever reason) between the filesystem and
> notmuch's view ; I prefer to have notmuch's view because after I'll
> query notmuch based on those folders - they're better exist in the database.

Great, I missed this.  I checked the implementation in C and indeed it
queries the database and does not do any filesystem operations.  So yes,
it does make sense to include these.

Slightly off-topic: this is why I was reluctant to work on this API
myself, I haven't used it and don't fully understand it's uses.  So I
appreciate this discussion and you taking the time to explain me this
kind of thing.

> * Also, regarding PurePath, you're initial reaction was sound, and I'm
> OK to remove PurePath and have a (valid, functional) Path instead.
> That's what I put in the second version of the patch.


Cheers,
Floris
_______________________________________________
notmuch mailing list -- notmuch@notmuchmail.org
To unsubscribe send an email to notmuch-leave@notmuchmail.org

Thread: