Re: PSA: Make sure your Airflow instance isn't public and isn't Google indexed

Bumping this one because now Airflow is in the news over it...

On Fri, Mar 23, 2018 at 9:33 AM, James Meickle <jmeickle@xxxxxxxxxxxxxx> wrote:

> While Googling something Airflow-related a few weeks ago, I noticed that
> someone's Airflow dashboard had been indexed by Google and was accessible
> to the outside world without authentication. A little more Googling
> revealed a handful of other indexed instances in various states of
> security. I did my best to contact the operators, and waited for responses
> before posting this.
>
> Airflow is not a secure project by default (
> jira/browse/AIRFLOW-2047), and you can do all sorts of mean things to an
> instance that hasn't been intentionally locked down. (And even then, you
> shouldn't rely exclusively on your app's authentication for providing
> security.)
>
> Having "internal" dashboards/data sources/executors exposed to the web is
> dangerous, since old versions can stick around for a very long time, help
> compromise unrelated deployments, and generally just create very bad press
> for the overall project if there's ever a mass compromise (see: Redis and
> MongoDB).
>
> Shipping secure defaults is hard, but perhaps we could add best practices
> like instructions for deploying a robots.txt with Airflow? Or an impact
> statement about what someone could do if they access your Airflow instance?
> I think that many people deploying Airflow for the first time might not
> realize that it can get indexed, or how much damage someone can cause via
> accessing it.
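
On the robots.txt suggestion above: a deny-all file is only two lines, so here is a minimal sketch of one, plus a check using Python's standard urllib.robotparser. The helper name is_crawlable and the example.com URL are made up for illustration; nothing here is shipped by Airflow, and how you actually serve the file (reverse proxy, static route, etc.) depends on your deployment.

```python
from urllib.robotparser import RobotFileParser

# Deny-all robots.txt: asks every compliant crawler to skip every path.
ROBOTS_TXT = "User-agent: *\nDisallow: /\n"


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if robots_txt would permit user_agent to fetch url.

    Hypothetical helper for illustration; not part of Airflow.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical hostname, used only to exercise the rules.
    print(is_crawlable(ROBOTS_TXT, "https://airflow.example.com/admin/"))
```

Keep in mind that robots.txt is purely advisory: it only keeps well-behaved crawlers from indexing the instance and does nothing to stop someone who already has the URL. It belongs alongside authentication and network-level restrictions, not in place of them.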

