pathlib


On 1/10/19 1:40 AM, Barry Scott wrote:
> 
> 
>> On 30 Sep 2019, at 12:51, Dan Sommers <2QdxY4RzWzUUiLuE at potatochowder.com> wrote:
>>
>> On 9/30/19 4:28 AM, Barry Scott wrote:
>>>> On 30 Sep 2019, at 05:40, DL Neil via Python-list <python-list at python.org> wrote:
>>>> Should pathlib reflect changes it has made to the file-system?
>>> I think it should not.
>>> A Path() is the name of a file it is not the file itself. Why should it
>>> track changes in the file system for the name?
>>
>> I would have said the same thing, but the docs disagree:  a
>> PurePath represents the name of (or the path to) a file, but a
>> Path represents the actual file.
> 
> I'm not seeing that wording in the python 3.7 pathlib documentation.
> Can you quote the exact wording please?
> 
> I do see this:
> 
> "Pure path objects provide path-handling operations which don't actually access a filesystem."
> 
> And:
> 
> "Concrete paths are subclasses of the pure path classes. In addition to operations provided
> by the latter, they also provide methods to do system calls on path objects."
> 
> There is no requirement that a Path() names a file that exists even.


That is true - of a "path".

However, as soon as said path is applied to the file system, eg .open( 
"r" ), the file had better exist or we will disappear into OSError-country.

If we use .open( "w" ) then the very real file is created within the FS, 
at that path.

So, the file does not need to exist for a path to be legal - which (to 
me) sounds like a PurePath.

However, don't we have to suspend such 'purity' in the "concrete" classes...
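
To make the distinction concrete, here is a minimal sketch (the file 
names and the temporary directory are my own, purely for illustration). 
It shows a pure path being manipulated with no file anywhere in sight, 
and a concrete path touching the file system only when asked to:

```python
import tempfile
from pathlib import Path, PurePosixPath

# A pure path is only a name: no file-system access, so nothing needs to exist.
pure = PurePosixPath("/no/such/dir/report.txt")
pure_parts = (pure.name, pure.suffix, str(pure.parent))

with tempfile.TemporaryDirectory() as tmp:
    concrete = Path(tmp) / "report.txt"   # still just a name at this point
    existed_before = concrete.exists()    # False: no file has been created yet

    try:
        concrete.open("r")                # now the file system is consulted
        read_failed = False
    except FileNotFoundError:             # a subclass of OSError
        read_failed = True

    with concrete.open("w") as f:         # creates a very real file at that path
        f.write("hello\n")
    existed_after = concrete.exists()     # True
```

Constructing the Path costs nothing and promises nothing; it is the 
.open() calls that meet the FS.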

>> That said, why doesn't your argument apply to read and write?  I
>> would certainly expect that writing to a path and then reading
>> from that same path would return the newly written data.  If I
>> squint funny, the Path object is tracking the operations on the
>> file system.
> 
> I do not expect that. Consider the time line:
> 
> 1. with p.open('w') write data
> 2. external process changes file on disk
> 3. with p.open('r') read data
> 
> How would (3) get the data written at (1) guaranteed?
> It will lead to bugs to assume that.
> 
> The path object is allowing system calls that need a file's path to be called,
> that is all. Beyond that there is no relationship between the pathlib.Path()
> objects and files.

In theory one can separate paths and files. In practice, if Python is to 
access a file it must pass a path to the OpSys/FS.

To be fair, the temporal argument (race condition) applies to any IO 
method unless "locking" is available (pathlib offers none; the standard 
library has only platform-specific locking, eg os.lockf() and the fcntl 
module on Unix, or msvcrt.locking() on Windows).
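
Barry's three-step time line can be sketched as follows. This is only 
an illustration: the "external process" is simulated by a plain open() 
in the same script, and the file name is made up.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    p = Path(tmp) / "data.txt"

    p.write_text("written via p\n")       # 1. write data through the Path

    # 2. Stand-in for an external process: the file changes behind p's back.
    with open(str(p), "w") as other:
        other.write("changed by someone else\n")

    seen = p.read_text()                  # 3. read data through the same Path

# seen holds whatever is on disk at step 3, not what p wrote at step 1.
```

The Path object holds no memory of step 1; each call simply asks the 
file system afresh.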

You cannot .open( 'r' ) unless the actual file exists.

You (probably) won't want to .open( 'w' ) if the actual file exists.

Wouldn't we like a library which helps, in both cases!
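
For what it is worth, both cases can be handled today with the built-in 
exceptions and the "x" (exclusive creation) open mode. A rough sketch, 
where read_if_present() and create_only() are hypothetical helper names 
of my own and not part of pathlib:

```python
import tempfile
from pathlib import Path

def read_if_present(path, default=None):
    """Return the file's text, or `default` if no file exists at `path`."""
    try:
        return path.read_text()
    except FileNotFoundError:
        return default

def create_only(path, text):
    """Create `path` and write `text`; return False if a file already exists."""
    try:
        with path.open("x") as f:   # "x" = exclusive creation, fails if present
            f.write(text)
    except FileExistsError:
        return False
    return True

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "notes.txt"
    missing = read_if_present(target, default="")        # no file yet
    first = create_only(target, "original\n")            # creates the file
    second = create_only(target, "overwrite attempt\n")  # refuses to clobber
    contents = target.read_text()
```

Asking forgiveness (try/except) rather than permission (.exists() 
first) avoids the gap between the check and the open, which is the same 
race condition discussed above.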


>> I think I'm actually arguing against some long since made (and
>> forgotten?) design decisions that can't be changed (dare I say
>> fixed?) because changing them would break backwards
>> compatibility.

+1


>> Yuck.  :-)  And I can absolutely see all sorts of different
>> expectations not being met and having to be explained by saying
>> "well, that's the way it works."
> 
> I'd suggest that the design is reasonable, and if there is misunderstanding then it's
> something that docs could address.

Design: +1
Misunderstanding: +1
Docs: +1

Perhaps pathlib encourages one/some to believe that it has been designed 
to do more than it does/set out to do?

Perhaps it does more than it should (by implementing some closer 
relationships with the underlying file system, but not others)?

Perhaps it makes sense for .name to be a data-attribute in the Pure 
classes, but it should be name(), ie a method-attribute in the concrete 
classes? (per other methods subject to change/mutation - see .stat() 
examples in earlier responses)
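
A small sketch of that asymmetry as things stand (file names invented 
for the example): .name is computed from the path string alone, whereas 
.stat() goes to the file system on every call, so after a rename the 
two no longer describe the same thing.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    p = Path(tmp) / "old.txt"
    p.write_text("abc")

    name_before = p.name              # derived from the path string alone
    size_before = p.stat().st_size    # a system call, answered by the file system

    new = Path(tmp) / "new.txt"
    p.rename(new)                     # the file moves; the object p is unchanged

    name_after = p.name               # still "old.txt"
    old_exists = p.exists()           # False: nothing lives at that name now
    new_exists = new.exists()         # True
    size_after = new.stat().st_size   # same three bytes, reached via the new name
```
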
-- 
Regards =dn