RE: Single Airflow Instance Vs Multiple Airflow Instance


We have had up to 50 DAGs, each with multiple tasks, and many of them run in parallel. We've had some compute issues: the deployment was meant to be temporary, but it has somehow become the permanent production one, and its resources are limited.
Organisationally, our situation is very similar to what Gerard described: more than one group, working with different engineering practices and standards. This is probably one of the sources of our problems.
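
To make the co-tenancy concern concrete, here is a toy model of the cluster-wide limit James mentions further down the thread. It is only a sketch under stated assumptions: `schedule`, `total_slots` and the task tuples are hypothetical names for illustration, and this is not Airflow's actual scheduler. It simply assumes one global cap on concurrently running tasks that is shared by every team on the instance.

```python
from collections import deque

def schedule(queue, total_slots, ticks):
    """Toy FIFO scheduler with a single cluster-wide slot limit.

    queue: list of (team, duration_in_ticks) tuples, in submission order.
    Returns (teams whose tasks finished, teams whose tasks never started).
    This is an illustrative model, not Airflow's real scheduling logic.
    """
    pending = deque(queue)
    running = []   # each entry is [team, remaining_ticks]
    finished = []
    for _ in range(ticks):
        # Start queued tasks while the shared limit still has free slots.
        while pending and len(running) < total_slots:
            team, duration = pending.popleft()
            running.append([team, duration])
        # Advance every running task by one tick.
        for task in running:
            task[1] -= 1
        finished.extend(task[0] for task in running if task[1] <= 0)
        running = [task for task in running if task[1] > 0]
    return finished, [team for team, _ in pending]

# Shared instance: two hung Team A tasks hold both slots, so Team B never starts.
shared_done, shared_waiting = schedule(
    [("A", 1000), ("A", 1000), ("B", 1)], total_slots=2, ticks=10
)

# Separate instance for Team B: its task runs immediately.
isolated_done, isolated_waiting = schedule([("B", 1)], total_slots=2, ticks=10)

print(shared_done, shared_waiting)
print(isolated_done, isolated_waiting)
```

In this model the only thing that changes between the two runs is whether Team B shares the slot limit with Team A, which is the trade-off being discussed here.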

-----Original Message-----
From: Gerard Toonstra <gtoonstra@xxxxxxxxx> 
Sent: Wednesday, June 6, 2018 5:02 PM
To: dev@xxxxxxxxxxxxxxxxxxxxxxxxxxxx
Subject: Re: Single Airflow Instance Vs Multiple Airflow Instance

We are using two cluster instances. One cluster is for the engineering teams in the "tech" wing, which rigorously follow tech principles; the other is for business analysts and more ad-hoc, experimental work by people who do not necessarily follow those principles. We have a nomad engineer helping out with the ad-hoc cluster: setting it up, connecting it to all systems and resolving programming questions. All clusters are fully puppetized, so we reuse configs and the way things are configured, and we have a common "platform code" package that is shared across both clusters.

G>


On Wed, Jun 6, 2018 at 5:50 PM, James Meickle <jmeickle@xxxxxxxxxxxxxx>
wrote:

> An important consideration here is that there are several settings 
> that are cluster-wide. In particular, cluster-wide concurrency 
> settings could result in Team B's DAG refusing to schedule based on an error in Team A's DAG.
>
> Do your teams follow similar practices in how eagerly they ship code, 
> or have similar SLAs for resolving issues? If so, you are probably 
> fine using co-tenancy. If not, you should probably talk about it first 
> to make sure the teams are okay with co-tenancy.
>
> On Wed, Jun 6, 2018 at 11:24 AM, gauthiermartin86@xxxxxxxxx < 
> gauthiermartin86@xxxxxxxxx> wrote:
>
> > Hi Everyone,
> >
> > We have been experimenting with Airflow for about 6 months now.
> > We are planning to have multiple departments use it. Since we
> > don't have any internal experience with Airflow, we are wondering
> > whether a single instance per department is better suited than a
> > single instance with multi-tenancy. We are aware of the upcoming
> > release of Airflow 1.10 and the changes that will be made to RBAC,
> > which will be better suited for multi-tenancy.
> >
> > Any advice on this? Any tips would be helpful to us.
> >
>
