

Interesting things about how to know it's a DAG file


Hi,

I just created a custom DAG class named "MyPipeline" by extending the "DAG" class, but Airflow fails to identify the file that uses it as a DAG file.

After digging into the Airflow implementation, I found this in dag_processing.py:

```
# Heuristic that guesses whether a Python file contains an
# Airflow DAG definition.
might_contain_dag = True
if safe_mode and not zipfile.is_zipfile(file_path):
    with open(file_path, 'rb') as f:
        content = f.read()
        might_contain_dag = all(
            [s in content for s in (b'DAG', b'airflow')])
```

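So when safe_mode is on, a file that is not a zip archive is treated as a possible DAG file only if its raw bytes contain both of the strings "DAG" and "airflow". The match is case-sensitive and is done on the file's text, not on what the code actually defines.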
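To check my understanding, here is a minimal standalone sketch of that heuristic. The function names `might_contain_dag` and `check_source`, the module `my_pipelines`, and the sample file contents are my own assumptions for illustration; only the byte check itself comes from the excerpt above.

```python
import os
import tempfile
import zipfile


def might_contain_dag(file_path, safe_mode=True):
    """Standalone copy of the quoted heuristic (hypothetical helper name)."""
    if not safe_mode or zipfile.is_zipfile(file_path):
        # The quoted code skips the content check in these cases.
        return True
    with open(file_path, "rb") as f:
        content = f.read()
    # Case-sensitive byte search: both markers must appear somewhere in the file.
    return all(s in content for s in (b"DAG", b"airflow"))


def check_source(source):
    """Write source to a temporary .py file and run the heuristic on it."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as tmp:
        tmp.write(source)
    try:
        return might_contain_dag(tmp.name)
    finally:
        os.remove(tmp.name)


# Hypothetical DAG file that only uses a custom subclass from another module.
without_markers = (
    "from my_pipelines import MyPipeline\n"
    "pipeline = MyPipeline('example')\n"
)
# The same file with a comment that mentions both markers.
with_markers = "# airflow DAG file\n" + without_markers

print(check_source(without_markers))
print(check_source(with_markers))
```

If this sketch matches what Airflow does, a file like the first one would be skipped, while simply mentioning both strings in a comment would be enough to pass the check.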

Is there a more robust way to detect DAG files than this string check?

Thanks,
Song