Re: Fusing operators together


Hi Shubham,

I think the EmrStepOperator and EmrStepSensor are a clear exception. Most
operators wait until the operation has finished successfully. For example,
the DruidOperator will block until the indexing job has successfully
finished:
https://github.com/apache/incubator-airflow/blob/master/airflow/hooks/druid_hook.py#L84-L109.
I think the EmrStepOperator should behave the same way, but this
slipped through during review. Hope this helps.
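
To illustrate the blocking pattern, here is a minimal sketch in plain
Python, not actual Airflow code. FakeStepHook, SubmitAndWaitOperator and
their methods and parameters are hypothetical names made up for this
example; a real operator would subclass BaseOperator and use a real hook.
The idea is that the hook holds the service logic, and the operator's
execute() submits the job and then polls until it reaches a final state:

```python
import time


class FakeStepHook:
    """Hypothetical stand-in for a hook that talks to an external service."""

    def __init__(self, polls_until_done=2):
        # Number of status checks that report RUNNING before COMPLETED.
        self._remaining = polls_until_done

    def submit_step(self, step):
        # A real hook would call the service API here and return its job id.
        return "step-1"

    def get_step_state(self, step_id):
        # Report RUNNING for a few polls, then COMPLETED.
        if self._remaining > 0:
            self._remaining -= 1
            return "RUNNING"
        return "COMPLETED"


class SubmitAndWaitOperator:
    """Hypothetical operator that fuses 'submit' and 'wait' into one task."""

    def __init__(self, hook, step, poke_interval=0.0, max_polls=10):
        self.hook = hook
        self.step = step
        self.poke_interval = poke_interval
        self.max_polls = max_polls

    def execute(self):
        # Submit the work, then block until it finishes or fails.
        step_id = self.hook.submit_step(self.step)
        for _ in range(self.max_polls):
            state = self.hook.get_step_state(step_id)
            if state == "COMPLETED":
                return step_id
            if state in ("FAILED", "CANCELLED"):
                raise RuntimeError("step %s ended in state %s" % (step_id, state))
            time.sleep(self.poke_interval)
        raise TimeoutError("step %s did not finish in time" % step_id)
```

With this shape, one task in the DAG covers both submitting and waiting,
so no separate sensor is needed. The trade-off is that the task occupies a
worker slot for the whole duration of the job.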

Cheers, Fokko



On Wed, 14 Nov 2018 at 21:56, Shubham Gupta <
y2k.shubhamgupta@xxxxxxxxx> wrote:

> *[Please let me know if this is NOT the correct place for such a query]*
>
> Hello maintainers and committers,
> I've stumbled upon this design decision for my Airflow project. Any
> pointers would be helpful.
>
> Overview
>
>    - I'm in the process of deploying Airflow and I've felt the need to
>    merge groups of operators that form a single logical task (to clear the
>    clutter in huge DAGs)
>    - The most common use-case would be coupling an operator and the
>    corresponding sensor. For instance, one might want to chain together the
>    EmrStepOperator and EmrStepSensor
>
>
> ----
>
> Possible approaches
>
>    - This could be achieved by offloading actual logic to Hooks and then
>    using as many hooks as needed within an operator
>    - A hacky alternative (if at all) would be SubDagOperator
>
>
> ----
>
> Questions
>
>    - Are hooks the right tool for this problem?
>    - Any other way to compose operators together?
>    - Is it a good idea to combine operators at all?
>
>
> Here's <https://stackoverflow.com/questions/53308306> my complete (more
> elaborate) question on StackOverflow
>
> Thanks
>
> *Shubham Gupta*
> Software Engineer
>  zomato
>