git.net

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

答复: How Airflow import modules as it executes the tasks


I think there might be two ways:

1. Setup the connections via. the Airflow UI: http://airflow.readthedocs.io/en/latest/configuration.html#connections, I guess this could be done in your code also.

2. Put your connection setup into a operator at the begin of your dag

________________________________
发件人: alireza.khoshkbari@ <gmail.com alireza.khoshkbari@xxxxxxxxx>
发送时间: 2018年5月16日 1:21
收件人: dev@xxxxxxxxxxxxxxxxxx
主题: How Airflow import modules as it executes the tasks

To start off, here is my project structure:
├── dags
│   ├── __init__.py
│   ├── core
│   │   ├── __init__.py
│   │   ├── operators
│   │   │   ├── __init__.py
│   │   │   ├── first_operator.py
│   │   └── util
│   │       ├── __init__.py
│   │       ├── db.py
│   ├── my_dag.py

Here is the versions and details of the airflow docker setup:

In my dag in different tasks I'm connecting to db (not Airflow db). I've setup db connection pooling,  I expected that my db.py would be be loaded once across the DagRun. However, in the log I can see that each task imports the module and new db connections made by each and every task. I can see that db.py is loaded in each task by having the line below in db.py:

logging.info("I was loaded {}".format(random.randint(0,100)))

I understand that each operator can technically be run in a separate machine and it does make sense that each task runs sort of independently. However, not sure that if this does apply in case of using LocalExecutor. Now the question is, how I can share the resources (db connections) across tasks using LocalExecutor.


( ! ) Warning: include(msgfooter.php): failed to open stream: No such file or directory in /var/www/git/apache-airflow-development/msg03375.html on line 100
Call Stack
#TimeMemoryFunctionLocation
10.0008364696{main}( ).../msg03375.html:0

( ! ) Warning: include(): Failed opening 'msgfooter.php' for inclusion (include_path='.:/var/www/git') in /var/www/git/apache-airflow-development/msg03375.html on line 100
Call Stack
#TimeMemoryFunctionLocation
10.0008364696{main}( ).../msg03375.html:0