Handle foreign character web input


On Sat, Jun 29, 2019 at 7:01 AM Tobiah <toby at tobiah.org> wrote:
>
>
> On 6/28/19 1:33 PM, Chris Angelico wrote:
> > On Sat, Jun 29, 2019 at 6:31 AM Tobiah <toby at tobiah.org> wrote:
> >>
> >> A guy comes in and enters his last name as Rönngren.
> >>
> >> So what did the browser really give me; is it encoded
> >> in some way, like latin-1?  Does it depend on whether
> >> the name was cut and pasted from a Word doc. etc?
> >> Should I handle these internally as unicode?  Right
> >> now my database tables are latin-1 and things seem
> >> to usually work, but not always.
> >
> > Definitely handle them as Unicode. You'll receive them in some
> > encoding, probably UTF-8, and it depends on the browser. Ideally, your
> > back-end library (eg Flask) will deal with that for you.
> It varies by browser?
> So these records are coming in from all over the world.  How
> do people handle possibly assorted encodings that may come in?
>
> I'm using Web2py.  Does the request come in with an encoding
> built in?  Is that how people get the proper unicode object?

Yes. Normally, the browser will say "hey, here's a request body, and
this is the encoding and formatting".

Try it - see what you get.
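
A rough sketch of what that looks like if you do it by hand, assuming a
form-encoded POST body and a Content-Type header that may or may not
carry a charset parameter. The function names here are made up for
illustration and are not Web2py's API; a framework will normally do this
for you. Falling back to UTF-8 when no charset is declared is also an
assumption on my part, since some browsers just use the encoding of the
page the form was served in.

```python
from email.message import Message
from urllib.parse import parse_qs


def charset_from_content_type(content_type, default="utf-8"):
    """Pull the charset parameter out of a Content-Type header value.

    Hypothetical helper: returns `default` when no charset is declared.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset() or default


def parse_form(raw_body, content_type):
    """Decode a form-urlencoded request body into a dict of str lists.

    The body itself is percent-encoded ASCII; the declared charset says
    how to interpret the percent-encoded bytes.
    """
    charset = charset_from_content_type(content_type)
    # Undecodable bytes become U+FFFD rather than raising.
    return parse_qs(raw_body.decode("ascii"), encoding=charset, errors="replace")


if __name__ == "__main__":
    body = b"last_name=R%C3%B6nngren"
    header = "application/x-www-form-urlencoded; charset=UTF-8"
    print(parse_form(body, header))
```

Once the value is a str, keep it as Unicode internally and only encode
again at the edges (database, output).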

ChrisA