"glob.glob('weirdness')" Any thoughts?


On 09/09/2018 02:20 PM, Gilmeh Serda wrote:
> 
> # Python 3.6.1/Linux
> (acts the same in Python 2.7.3 also, by the way)
> 
> >>> from glob import glob
>
> >>> glob('./Testfile *')
> ['./Testfile [comment] some text.txt']
>
> >>> glob('./Testfile [comment]*')
> []
>
> >>> glob('./Testfile [comment? some text.*')
> ['./Testfile [comment] some text.txt']
>
> >>> glob('./Testfile [comment[]*')
> ['./Testfile [comment] some text.txt']
>
> >>> glob('./Testfile [comment[] some text.*')
> []
> 
> Testing:
> 
> >>> import fnmatch
>
> >>> fnmatch.translate('./Testfile [comment] some text.*')
> '(?s:\\.\\/Testfile\\ [comment]\\ some\\ text\\..*)\\Z'
>
> >>> fnmatch.translate('./Testfile [comment? some text.*')
> '(?s:\\.\\/Testfile\\ \\[comment.\\ some\\ text\\..*)\\Z'
>
> It seems it translates [comment] into a set of valid characters to look
> for and doesn't treat the [ and ] as literal characters, which is what
> breaks it, I assume.
> 
> ? translates into RegEx .
> * translates into RegEx .*
> 
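
To make the character-set reading above concrete, here is a small sketch
using fnmatch directly (the file name is the one from the session above; the
variable names are only for illustration). "[comment]" matches exactly one
character out of c, o, m, e, n, t, so it can never match the literal text
"[comment]":

```python
import fnmatch

name = "Testfile [comment] some text.txt"
pattern = "Testfile [comment] some text.*"

# "[comment]" is a set: one character from c, o, m, e, n, t.
matches_single_char = fnmatch.fnmatchcase("Testfile c some text.txt", pattern)

# The real file name has "[comment]" spelled out, so the set does not match it.
matches_real_name = fnmatch.fnmatchcase(name, pattern)

# "[[]" is a set containing only "[", so the bracket is matched literally;
# the later "]" has no opening bracket and is treated as a plain character.
matches_bracketed = fnmatch.fnmatchcase(name, "Testfile [[]comment] some text.*")

print(matches_single_char, matches_real_name, matches_bracketed)
```

This also explains why '[comment[]*' matched earlier: that set happens to
contain "[", and the "*" swallows the rest of the name.
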
> And escaping doesn't work either, because:

https://docs.python.org/3/library/glob.html#glob.escape demonstrates a 
way of escaping that works:

glob('./Testfile [[]comment]*')
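
A rough sketch of the same thing with glob.escape() doing the bracket
wrapping, run against a throwaway temporary directory so it does not depend
on the current directory. Note that glob.escape() was added in Python 3.4,
so it is not available on 2.7. fnmatch-style patterns have no backslash
escape; wrapping the special character in brackets is the documented way to
match it literally.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    name = "Testfile [comment] some text.txt"
    # Create an empty file with the awkward name.
    open(os.path.join(tmp, name), "w").close()

    prefix = os.path.join(tmp, "Testfile [comment]")

    # Unescaped: "[comment]" is read as a character set, so nothing matches.
    unescaped = glob.glob(prefix + "*")

    # Escaped: glob.escape() rewrites "[" as "[[]", then "*" is appended
    # afterwards so that it stays a wildcard.
    escaped = glob.glob(glob.escape(prefix) + "*")

print(unescaped)
print([os.path.basename(p) for p in escaped])
```

Only the literal part of the path goes through glob.escape(); any wildcard
you still want has to be added after escaping.
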

> 
> >>> fnmatch.translate('./Testfile [comment\] some text.*')
> '(?s:\\.\\/Testfile\\ [comment\\\\]\\ some\\ text\\..*)\\Z'
> 
> ...the \ is replaced with \\ which breaks the pattern since it is no
> longer a real escape character but an escaped escape character, if that
> makes sense.
> 
> When the pattern does not contain a complete RegEx set, e.g.,
> './Testfile comment]*', it works.
> 
> Yes, easy to say "don't write that into file names," but I don't make the
> rules.
> 
> I guess I have to write my own.
> 
> Oh, well...
>