Fundamental change - Separate DAG name and id.


Hi,
Although this could have been raised on Jira, I think it should be discussed here first.

The problem:
Airflow mixes the DAG name with the DAG id. It uses the same field for both purposes.

I assume that most of you use the dag_id to describe what the DAG actually does.
For example:

dag = DAG(
    dag_id='cost_report_daily',
...
)

This dag_id is what appears in the DAG id column in the UI.
Now, let's say you want to add another task to this DAG. You have to be extremely careful if you change the dag_id to reflect the new functionality, for example dag_id='cost_expenses_reports_daily'. Doing so breaks the history of the DAG, since past runs stay recorded under the old dag_id.
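
To make the breakage concrete, here is a toy model of it. It is only a sketch: run_history, record_run and runs_for are hypothetical stand-ins for the metadata database, not Airflow's actual schema or API, and the dates are made up.

```python
from collections import defaultdict

# Toy stand-in for the metadata database: runs are keyed by the dag_id string.
# Illustration only, not Airflow's real schema.
run_history = defaultdict(list)


def record_run(dag_id, execution_date):
    """Store a run under the given dag_id."""
    run_history[dag_id].append(execution_date)


def runs_for(dag_id):
    """Return the runs recorded under exactly this dag_id."""
    return list(run_history.get(dag_id, []))


# Runs recorded while the DAG was still called 'cost_report_daily'.
record_run("cost_report_daily", "2018-11-01")
record_run("cost_report_daily", "2018-11-02")

# After the rename, lookups use the new string and no longer find the old runs.
old_runs = runs_for("cost_report_daily")
new_runs = runs_for("cost_expenses_reports_daily")
```

The old runs still exist, but nothing ties them to the renamed DAG.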

Or take an even simpler use case: the user just wants to change the name shown in the UI.

I suggest we discuss whether dag_id should be split into an id (an actual identifier) and a name that reflects what the DAG does. If the "connection" is made by id, names can change as often as you like without breaking anything; the name essentially becomes a field used for display purposes only.
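
A rough sketch of what I mean, in plain Python. DagRecord, registry, register and rename are hypothetical names I made up for illustration; this is not an existing Airflow API, just the idea of keying everything by a stable id and treating the name as a label.

```python
from dataclasses import dataclass, field


@dataclass
class DagRecord:
    """Hypothetical record: `id` is the stable key, `name` is display-only."""
    id: int
    name: str
    runs: list = field(default_factory=list)


# Everything is looked up by the stable id, never by the name.
registry = {}


def register(dag_id, name):
    """Create a record for a new DAG under its stable id."""
    registry[dag_id] = DagRecord(id=dag_id, name=name)
    return registry[dag_id]


def rename(dag_id, new_name):
    """Change only the display name; the id and its history are untouched."""
    registry[dag_id].name = new_name


dag = register(1, "cost_report_daily")
dag.runs.append("run-1")

# Renaming does not orphan the history, because runs hang off the id.
rename(1, "cost_expenses_reports_daily")
```

After the rename, registry[1] still carries the run recorded under the old name.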

* I also haven't mentioned the DAG file name, which can cause trouble too if someone wants to change it.
