pathlib


On 1/10/19 1:09 AM, Chris Angelico wrote:
> On Mon, Sep 30, 2019 at 9:54 PM Dan Sommers
> <2QdxY4RzWzUUiLuE at potatochowder.com> wrote:
>> I would have said the same thing, but the docs[1] disagree:  a
>> PurePath represents the name of (or the path to) a file, but a
>> Path represents the actual file.
>>
>>
>> [1] https://docs.python.org/3/library/pathlib.html
> 
> I don't think it represents the actual file. If it did, equality would
> be defined by samefile, NOT by the file name.
> 
>>>> from pathlib import Path
>>>> import os
>>>> open("file1", "w").close()
>>>> os.link("file1", "file2")
>>>> Path("file1") == Path("file2")
> False
>>>> Path("file1").samefile(Path("file2"))
> True
>>>> Path("file1") == Path("file1")
> True


This example involves a "hard link" in Linux - can't recall if it 
applies under MS-Win...

On a Linux system the 'two' files share the same inode, and thus the 
same 'patch of disk', but one can be deleted without affecting the 
other. Thus, borrowing from the above snippet:

 >>> import os
 >>> import pathlib
 >>> open("file1", "w").close()
 >>> os.link("file1", "file2")
 >>> f1 = pathlib.Path( "file1" )
 >>> f2 = pathlib.Path( "file2" )
 >>> f1 == f2
False
### they are not the same path (by name) - due only to the last character

 >>> f1.stat()
os.stat_result(st_mode=33204, st_ino=1049466, st_dev=64769, st_nlink=2, 
st_uid=1000, st_gid=1000, st_size=0, st_atime=1569903409, 
st_mtime=1569903409, st_ctime=1569903410)
 >>> f2.stat()
os.stat_result(st_mode=33204, st_ino=1049466, st_dev=64769, st_nlink=2, 
st_uid=1000, st_gid=1000, st_size=0, st_atime=1569903409, 
st_mtime=1569903409, st_ctime=1569903410)
 >>> f1.samefile( f2 )
True
### but they do 'label' the same file when the path is applied to the FS.
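
For anyone who wants to repeat this without littering their working directory, here is a minimal sketch as a script. The function name hard_link_demo and the use of a temporary directory are my own choices for illustration, and it assumes a filesystem that supports hard links. It also checks the claim that one name can be deleted without affecting the other.

```python
import os
import tempfile
from pathlib import Path


def hard_link_demo():
    """Create a file plus a hard link to it, then compare the two paths."""
    with tempfile.TemporaryDirectory() as tmp:
        f1 = Path(tmp) / "file1"
        f2 = Path(tmp) / "file2"
        f1.touch()
        os.link(f1, f2)  # a second name for the same inode

        result = {
            "equal_as_paths": f1 == f2,        # compares names only
            "same_file": f1.samefile(f2),      # asks the filesystem
            "links_before": f1.stat().st_nlink,
        }

        f1.unlink()  # remove one name; the file stays reachable via f2
        result["f1_exists_after_unlink"] = f1.exists()
        result["f2_exists_after_unlink"] = f2.exists()
        result["links_after"] = f2.stat().st_nlink
        return result


print(hard_link_demo())
```

The comparison with == never touches the disk, whereas samefile() has to stat both paths, which is why the two answers can differ.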

Let's try something similar, but the other way around (file2 has been
removed in the meantime, hence st_nlink=1 below):

 >>> f1 = pathlib.Path( "file1" )
 >>> f1.stat()
os.stat_result(st_mode=33204, st_ino=1049466, st_dev=64769, st_nlink=1, 
st_uid=1000, st_gid=1000, st_size=0, st_atime=1569903409, 
st_mtime=1569903409, st_ctime=1569903851)
### this path exists in the FS

 >>> f2 = pathlib.Path( "not-file1" )
 >>> f2.stat()
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/usr/lib64/python3.7/pathlib.py", line 1168, in stat
     return self._accessor.stat(self)
FileNotFoundError: [Errno 2] No such file or directory: 'not-file1'
### this path does not describe a file in the FS

 >>> f1.rename( f2 )
### here's a rename operation per the manual!
 >>> f1.stat()
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/usr/lib64/python3.7/pathlib.py", line 1168, in stat
     return self._accessor.stat(self)
FileNotFoundError: [Errno 2] No such file or directory: 'file1'
### but f1 does not represent a 'real file' any more

 >>> f2.stat()
os.stat_result(st_mode=33204, st_ino=1049466, st_dev=64769, st_nlink=1, 
st_uid=1000, st_gid=1000, st_size=0, st_atime=1569903409, 
st_mtime=1569903409, st_ctime=1569904293)
### f2 is a path which represents a real-world file though
### where have we seen those numbers before?
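
The same observation as a small script, in case it helps: a sketch that compares the (st_dev, st_ino) pair before and after the rename. The function name rename_keeps_inode and the temporary directory are illustrative assumptions, and the inode staying put is what I would expect for a rename within one directory on a POSIX filesystem, not something pathlib itself promises.

```python
import tempfile
from pathlib import Path


def rename_keeps_inode():
    """Rename a file and report whether its (device, inode) pair survived."""
    with tempfile.TemporaryDirectory() as tmp:
        f1 = Path(tmp) / "file1"
        f2 = Path(tmp) / "not-file1"
        f1.touch()

        st = f1.stat()
        before = (st.st_dev, st.st_ino)

        f1.rename(f2)  # the file moves to the new name; f1 itself is unchanged

        st = f2.stat()
        after = (st.st_dev, st.st_ino)
        return before == after, f1.exists(), f2.exists()


print(rename_keeps_inode())
```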


 > It still represents the path to the file, not the file itself, and if
 > you move something over it, it will see the new file.

I like this description. Thanks!
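
A quick sketch of "move something over it", to convince myself. The names replace_under_path, target and other are made up for the illustration, and os.replace() stands in for whatever tool does the moving. The Path object is created once and never changes, yet it reports the new file afterwards.

```python
import os
import tempfile
from pathlib import Path


def replace_under_path():
    """Move a different file over a path and see what the Path reports."""
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "target"
        other = Path(tmp) / "other"
        target.write_text("old contents")
        other.write_text("new contents")

        other_inode = other.stat().st_ino
        os.replace(other, target)  # 'other' now lives under the name 'target'

        # The Path object was never modified, but it now resolves to the new file.
        return target.read_text(), target.stat().st_ino == other_inode


print(replace_under_path())
```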

That said, maybe pathlib should have stuck with paths/PurePaths, and we 
should have something else (logically separate, but closely-related) 
which manipulates the files themselves?
(preferably OO, and tidying up/offering an alternative to the morass of 
os, os.path, sys, shutil, and others?)
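
To make the idea concrete, here is a rough sketch of what such a "file" object might look like. FileIdentity is purely hypothetical (nothing like it exists in pathlib); it defines equality the samefile way, by (st_dev, st_ino), instead of by name. It only takes a snapshot at construction time, and inode numbers can be reused once a file is deleted, so treat it as an illustration and not a design.

```python
import os
import tempfile
from pathlib import Path


class FileIdentity:
    """Hypothetical: identifies a file by (device, inode), not by its name."""

    def __init__(self, path):
        st = os.stat(path)  # raises FileNotFoundError if nothing is there
        self._key = (st.st_dev, st.st_ino)

    def __eq__(self, other):
        if not isinstance(other, FileIdentity):
            return NotImplemented
        return self._key == other._key

    def __hash__(self):
        return hash(self._key)


def identity_demo():
    """Compare two hard-linked names as paths and as file identities."""
    with tempfile.TemporaryDirectory() as tmp:
        f1 = Path(tmp) / "file1"
        f2 = Path(tmp) / "file2"
        f1.touch()
        os.link(f1, f2)
        return f1 == f2, FileIdentity(f1) == FileIdentity(f2)


print(identity_demo())
```
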
-- 
Regards =dn