fileinput


On Tuesday, 29 October 2019 at 10:34:22 UTC+1, Inada Naoki wrote:
> When you read from stdin, fileinput doesn't open the file itself:
> Python has already opened stdin, so openhook has no effect on it.
> 
> You can use sys.stdin.reconfigure() to change the encoding and error
> handler of stdin.
> 
> On Tue, Oct 29, 2019 at 6:31 PM <patatetom at gmail.com> wrote:
> >
> > On Monday, 28 October 2019 at 11:48:29 UTC+1, Peter J. Holzer wrote:
> > > On 2019-10-25 22:12:23 +0200, Pascal wrote:
> > > > for line in fileinput.input(source):
> > > >     print(line.strip())
> > > >
> > > > -----------------------
> > > >
> > > > python3.7.4 myscript.py myfile.log
> > > > Traceback (most recent call last):
> > > > ...
> > > > UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe8 in position 799:
> > > > invalid continuation byte
> > > [...]
> > > > for line in fileinput.input(source,
> > > > openhook=fileinput.hook_encoded("utf-8", "ignore")):
> > > >     print(line.strip())
> > >
> > > The file you were trying to read was obviously not encoded in UTF-8,
> > > since you got a decode error.
> > >
> > > So the first question you should ask is:
> > >
> > > Is it supposed to be encoded in UTF-8 (and just corrupted) or is it
> > > supposed to be encoded in something else (e.g. iso-8859-1 or win-1252)?
> > >
> > > If it is supposed to be in UTF-8 but may contain errors, ignoring errors
> > > may be reasonable.
> > >
> > > If it is supposed to be something else, determine what that "something
> > > else" actually is, and use that.
> > >
> > >         hp
> > >
> > > --
> > >    _  | Peter J. Holzer    | we build much bigger, better disasters now
> > > |_|_) |                    | because we have much more sophisticated
> > > | |   | hjp at hjp.at         | management tools.
> > > __/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>
> >
> > You're right: the log file came from Windows and was encoded in iso-8859-1. But my question was about why the result differs between reading a file and reading from stdin.
> > --
> > https://mail.python.org/mailman/listinfo/python-list
> 
> 
> 
> -- 
> Inada Naoki  <songofacandy at gmail.com>

Thanks for the tip!
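
For anyone finding this thread later, here is a minimal sketch of how the two pieces of advice might be combined, assuming the input is iso-8859-1 as in this thread. The helper name read_lines and its parameters are made up for illustration. hook_encoded() covers the files that fileinput opens itself, while sys.stdin.reconfigure() (Python 3.7+) covers stdin, which Python has already opened before the script runs.

```python
import fileinput
import sys


def read_lines(files, encoding="iso-8859-1", errors="strict"):
    """Yield stripped lines from a list of paths, or from stdin if the list is empty.

    Hypothetical helper: the name and defaults are assumptions for this sketch.
    """
    # fileinput falls back to stdin when no files are given or when "-" is listed.
    # openhook is not applied to stdin, so change its decoding directly.
    if not files or "-" in files:
        reconfigure = getattr(sys.stdin, "reconfigure", None)  # missing before 3.7
        if reconfigure is not None:
            reconfigure(encoding=encoding, errors=errors)

    # Named files are opened by fileinput, so the hook decides how they are decoded.
    hook = fileinput.hook_encoded(encoding, errors)
    with fileinput.input(files=files, openhook=hook) as stream:
        for line in stream:
            yield line.strip()
```

Called as read_lines(sys.argv[1:]), this should give the same output for "python3 myscript.py myfile.log" and "python3 myscript.py < myfile.log". If the real encoding is unknown, passing errors="ignore" or errors="replace" with "utf-8" is possible too, at the cost of silently dropping or replacing bytes.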