Sandboxing eval() (was: Calculator)


On Tue, Jan 21, 2020 at 4:59 PM <musbur at posteo.org> wrote:
>
> On Mon, 20 Jan 2020 06:43:41 +1100
> Chris Angelico <rosuav at gmail.com> wrote:
>
> > On Mon, Jan 20, 2020 at 4:43 AM <musbur at posteo.org> wrote:
> > > It works, but is it safe?
> >
> > As such? No.
>
> That's what many people have said, and I believe them. But just from a
> point of technical understanding: If I start with empty global and
> local dicts, and an empty __builtins__, and I screen the input string
> so it can't contain the string "import", is it still possible to have
> "targeted" malicious attacks? Of course by gobbling up memory any
> script can try and crash the Python interpreter or the whole machine
> wreaking all sorts of havoc, but by "targeted" I mean accessing the
> file system or the operating system in a deterministic way.

You would also need to provide your own __import__ function, because
otherwise you can trivially get around it by rewording things a
little. And then there are a variety of less easy exploits that
generally start by accessing a dunder off some constant.
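
To make those two points concrete, here is a minimal sketch (CPython
assumed; the variable names are only for illustration). It shows that
an empty globals dict still gets the real builtins, and that emptying
__builtins__ does not stop an expression from walking dunders off a
constant. The expression never contains the string "import".

```python
# An empty globals dict is not an empty namespace: eval() inserts the
# real builtins under "__builtins__" when that key is missing.
g = {}
eval("1 + 1", g)
builtins_injected = "__builtins__" in g

# Even with __builtins__ explicitly emptied, the dunders of a plain
# constant lead back to `object`, and from there to every class that
# is currently loaded in the interpreter.
expr = "().__class__.__base__.__subclasses__()"
classes = eval(expr, {"__builtins__": {}}, {})

print(builtins_injected)
print(len(classes), "classes reachable without any builtins")
```

What an attacker does with that class list varies by Python version,
which is why string screening is not a reliable defence here.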

> My own Intranet application needs to guard against accidents, not
> intentionally malicious attacks.

Hmm. You're going to have to make your own evaluation of risk vs
restriction. (And factoring in the effort required. Certain balances
of risk/restriction take more effort than others do.)

> > However, there are some elegant hybrid options, where you
> > can make use of the Python parser to do some of your work, and then
> > look at the abstract syntax tree.
>
> Sounds interesting. All I need is a few lines of arithmetic
> and variable assignments. Blocking ':' from the input should add some
> safety, too.

Cool, in that case it should be possible. But instead of trying to do
string sanitization, define your whitelist by Python's operations.
(That's what the AST module is for.) So, for instance, you might
permit Constant nodes, BinOp (maybe with a restricted set of legal
operators), Name, Compare, and maybe a few others, but disallow
Attribute. That way, the dot in "2.5" is perfectly legal, but the dot
in "(2).real" would be forbidden, as would "2.5.real" (if you evaluate
that, what it does is look up the attribute "real" on the literal
float 2.5).

Full list of AST nodes:
https://docs.python.org/3/library/ast.html#abstract-grammar
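
A whitelist along those lines might look like the sketch below. It
assumes Python 3.8+ (for ast.Constant) and handles only arithmetic and
simple "name = expression" assignments. The names BIN_OPS, UNARY_OPS,
evaluate and run are hypothetical, and the set of permitted operators
is just one possible choice.

```python
import ast
import operator

# Hypothetical whitelist of operators; extend or shrink it to taste.
BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def evaluate(node, names):
    """Recursively evaluate one whitelisted expression node."""
    if isinstance(node, ast.Constant):
        # bool is a subclass of int, so it is excluded explicitly.
        if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
            return node.value
        raise ValueError("only numeric constants are allowed")
    if isinstance(node, ast.Name):
        if node.id in names:
            return names[node.id]
        raise ValueError(f"unknown name: {node.id}")
    if isinstance(node, ast.BinOp) and type(node.op) in BIN_OPS:
        left = evaluate(node.left, names)
        right = evaluate(node.right, names)
        return BIN_OPS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and type(node.op) in UNARY_OPS:
        return UNARY_OPS[type(node.op)](evaluate(node.operand, names))
    # Attribute, Call, Subscript, Lambda, ... all end up here.
    raise ValueError(f"disallowed syntax: {type(node).__name__}")


def run(source, names=None):
    """Run a few lines of assignments and return the resulting names."""
    names = {} if names is None else names
    for stmt in ast.parse(source).body:
        if (isinstance(stmt, ast.Assign)
                and len(stmt.targets) == 1
                and isinstance(stmt.targets[0], ast.Name)):
            names[stmt.targets[0].id] = evaluate(stmt.value, names)
        elif isinstance(stmt, ast.Expr):
            evaluate(stmt.value, names)
        else:
            raise ValueError(f"disallowed statement: {type(stmt).__name__}")
    return names


print(run("x = 2.5\ny = x * 4 + 1"))
```

Anything not named in the whitelist is rejected before it runs, so
run("x = (2).real") raises ValueError rather than touching the
attribute.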

There are still a few vulnerabilities that this won't protect you
from, but they're mostly the "gobbling up memory" sort, and as you
mentioned, not an issue on the intranet. Do be aware, though, that
exponentiation can result in some pretty big numbers pretty quickly
(evaluating 9**9**9 takes..... a while). But other than that, this is
almost certainly the easiest way to make an expression evaluator that
uses Python syntax.
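
If exponentiation is wanted anyway, one option is to route it through
a function that refuses oversized operands. This is only a sketch: the
name guarded_pow and both limits are hypothetical and would need
tuning. Note that ** is right-associative, so 9**9**9 means
9**(9**9), i.e. 9**387420489.

```python
import operator

# Hypothetical limits; pick values that suit the application.
MAX_EXPONENT = 1000
MAX_BASE = 10**6


def guarded_pow(base, exponent):
    """Like base ** exponent, but refuses operands that are too large."""
    if abs(exponent) > MAX_EXPONENT or abs(base) > MAX_BASE:
        raise ValueError("power operands too large")
    return operator.pow(base, exponent)


print(guarded_pow(2, 10))
# The inner 9**9 is cheap; it is the outer power that gets refused.
try:
    guarded_pow(9, guarded_pow(9, 9))
except ValueError as exc:
    print("rejected:", exc)
```

In a whitelist-based evaluator, the Pow operator could be mapped to a
function like this instead of to the plain ** operation.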

Now, if that's insufficient... your next option would probably be to
embed some other language, like Lua or JavaScript...

ChrisA