Is `airflow backfill` dysfunctional?


So I'm running a backfill for what feels like the first time in years using
a simple `airflow backfill --local` command.

First I start getting a ton of `logging.info` lines, one for each task that
cannot be started just yet, at every tick, flooding my terminal with the
keyword `FAILED`. It looks like a million lines like this one:

[2018-05-24 14:33:07,852] {models.py:1123} INFO - Dependencies not met for
<TaskInstance: some_dag.some_task_id 2018-01-28 00:00:00 [scheduled]>,
dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success'
requires all upstream tasks to have succeeded, but found 1 non-success(es).
upstream_tasks_state={'successes': 0L, 'failed': 0L, 'upstream_failed': 0L,
'skipped': 0L, 'done': 0L}, upstream_task_ids=['some_other_task_id']

Good thing I triggered 1 month and not the 2 years I actually need; the
logs alone would be "big data". Now I'm unclear whether there's anything
actually running or if I did something wrong, so I decide to kill the
process so I can set a smaller date range and get a better picture of
what's up.

I check my logging level, am I in DEBUG? Nope. Just INFO. So I take a note
that I'll need to find that log-flooding line and demote it to DEBUG in a
quick PR, no biggy.
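
Until the call itself is changed at the source, a stopgap along these lines
could quiet the terminal. This is only a minimal sketch using the standard
`logging` module, not Airflow code: the class name `DemoteToDebug` is made
up for illustration, and the marker string is taken from the log line above.

```python
import logging


class DemoteToDebug(logging.Filter):
    """Relabel matching records as DEBUG, then drop anything below threshold.

    Hypothetical helper, not part of Airflow. Attach it to a handler.
    """

    def __init__(self, marker, threshold=logging.INFO):
        super().__init__()
        self.marker = marker        # substring identifying the noisy message
        self.threshold = threshold  # lowest level the handler should emit

    def filter(self, record):
        if self.marker in record.getMessage():
            # Treat the noisy message as if it had been logged at DEBUG.
            record.levelno = logging.DEBUG
            record.levelname = "DEBUG"
        return record.levelno >= self.threshold
```

Attached with `handler.addFilter(DemoteToDebug("Dependencies not met"))`,
the flood disappears at INFO and comes back if the threshold is lowered to
`logging.DEBUG`.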

Now I restart with just a single schedule, and get an error `Dag {some_dag}
has reached maximum amount of 3 dag runs`. Hmmm, I wish backfill could just
pick up where it left off. Maybe I need to run an `airflow clear` command
and restart? Ok, ran my clear command, same error is showing up. Dead end.

Maybe there is some new `airflow clear --reset-dagruns` option? Doesn't
look like it... Maybe `airflow backfill` has some new switches to pick up
where it left off? Can't find it. Am I supposed to clear the DAG Runs
manually in the UI?  This is a pre-production, in-development DAG, so it's
not on the production web server. Am I supposed to fire up my own web
server to go and manually handle the backfill-related DAG Runs? Connect to
my staging MySQL and manually clear some DAG runs?

So. Fire up a web server, navigate to my dag_id, delete the DAG runs, it
appears I can finally start over.

Next thought was: "Alright looks like I need to go Linus on the mailing
list".

What am I missing? I'm really hoping these issues are specific to 1.8.2!

Backfilling is core to Airflow and should work very well. I want to restate
some reqs for Airflow backfill:
* when failing / interrupted, it should seamlessly be able to pick up where
it left off
* terminal logging at the INFO level should be a clear, human-consumable
indicator of progress
* backfill-related operations (including restarts) should be doable through
CLI interactions, and not require web server interactions as the typical
sandbox (dev environment) shouldn't assume the existence of a web server
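
To make the first two requirements concrete, here is a rough sketch in
plain Python. It is not how Airflow implements backfill; it assumes a daily
schedule, assumes the set of already-succeeded execution dates is available
from somewhere, and the function names are hypothetical.

```python
from datetime import date, timedelta


def remaining_dates(start, end, completed):
    """Yield each daily schedule in [start, end] that has not succeeded yet."""
    day = start
    while day <= end:
        if day not in completed:
            yield day
        day += timedelta(days=1)


def progress_line(start, end, completed):
    """Build one human-readable progress line for the terminal."""
    total = (end - start).days + 1
    done = sum(1 for d in completed if start <= d <= end)
    return "backfill progress: %d/%d schedules done" % (done, total)
```

A restarted backfill would iterate over `remaining_dates(...)` only, and
print one `progress_line(...)` per tick instead of one line per blocked
task.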

Let's fix this.

Max