Jinja and non-ASCII characters (was Re: Prepare accented characters for HTML)


On 29/03/2019 12.39, Tony van der Hoff wrote:
> On 29/03/2019 11:08, Chris Angelico wrote:
>> On Fri, Mar 29, 2019 at 9:12 PM Tony van der Hoff <lists at vanderhoff.org> wrote:
>>>
>>> Hello Chris.
>>> Thanks for your interest.
>>>
>>> On 28/03/2019 18:04, Chris Angelico wrote:
>>>> On Fri, Mar 29, 2019 at 4:10 AM Tony van der Hoff <lists at vanderhoff.org> wrote:
>>>>>
>>>>> This'll probably work:
>>>>
>>>> You have a python3 shebang, but are you definitely running this under Python 3?
>>>>
>>> Absolutely.
>>>
>>>> Here's a much more minimal example. Can you see if this also fails for you?
>>>>
>>>> import sys
>>>> from jinja2 import Template
>>>> print(Template("French: {{french}}").render({"french": "année"}))
>>>> print(sys.version)
>>>>
>>>
>>> Presumably you expect to run this from the command line. It works as
>>> expected:
>>>
>>> French: année
>>> 3.5.3 (default, Sep 27 2018, 17:25:39)
>>> [GCC 6.3.0 20170516]
>>>
>>> However, with a slight modification:
>>>
>>> #!/usr/bin/env python3
>>>
>>> import sys
>>> from jinja2 import Template
>>> print ("Content-type: text/html\n\n")
>>
>> Try: text/html; charset=utf-8
>>
> No difference
> 
>> That might be all you need to make the browser understand it
>> correctly. Otherwise, as Thomas says, you will need to figure out
>> where the traceback is, which can probably be answered by figuring out
>> what "running it in a browser" actually means.
>>
> 
> Running in browser:
> http://localhost/~tony/private/home/learning/jinja/minimal/minimal.py
> 
> In apache2.access.log:

So it's running in apache!

Now the question is what apache is doing. Is it running it as a CGI
script? Is it doing something clever for Python files (maybe involving
Python 2?)

... wild guess: if the script is running as CGI in an environment with an
ASCII-using "C" locale, with Python 3.5, you wouldn't be able to print
non-ASCII characters by default. I think. In any case I remember reading
about this problem (if this is the problem) being fixed in a newer
version of Python.
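
A minimal sketch of how that guess could be tested, assuming the script
really is run as CGI: log what Python thinks the output encoding is, then
write the response as UTF-8 bytes so the locale no longer matters. The
build_response() helper and the hard-coded body are made up for
illustration; the body stands in for the Template(...).render(...) call
from the minimal example.

```python
#!/usr/bin/env python3
# Sketch only: assumes the script is executed as a CGI program.
import locale
import sys


def build_response(body):
    """Return a complete CGI response (header, blank line, body) as UTF-8 bytes."""
    headers = "Content-Type: text/html; charset=utf-8\n\n"
    return (headers + body).encode("utf-8")


def main():
    # Diagnostics go to stderr, which Apache normally copies to its error log.
    print("stdout encoding:", sys.stdout.encoding, file=sys.stderr)
    print("preferred encoding:", locale.getpreferredencoding(False), file=sys.stderr)

    # Hypothetical stand-in for the rendered jinja2 template.
    body = "French: année\n"

    # Writing bytes to the underlying buffer bypasses sys.stdout's
    # locale-derived text encoding entirely.
    sys.stdout.buffer.write(build_response(body))
    sys.stdout.buffer.flush()


if __name__ == "__main__":
    main()
```

If the logged stdout encoding turns out to be something ASCII-ish, that
would support the guess. Another option that might work without touching
the script is setting PYTHONIOENCODING=utf-8 in the environment Apache
gives to CGI scripts (e.g. with SetEnv, if mod_env is loaded).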

> ::1 - tony [29/Mar/2019:11:22:13 +0000] "GET
> /~tony/private/home/learning/jinja/minimal/minimal.py HTTP/1.1" 200 204
> "http://localhost/~tony/private/home/learning/jinja/minimal/"
> "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko)
> Chrome/72.0.3626.81 Safari/537.36"
> ::1 - - [29/Mar/2019:11:23:04 +0000] "-" 408 0 "-" "-"
> ::1 - - [29/Mar/2019:11:23:04 +0000] "-" 408 0 "-" "-"
> 
> So, 408 is a bit unusual for localhost. With the accented character
> removed, no timeout is reported. Maybe a clue.
> 
> Can find no other traceback. Nothing relevant in apache2/error.log
> 
>