Re: Use KubernetesExecutor to launch tasks into a Dask cluster in Kubernetes


@Kyle didn't see your middle message there:

You could certainly have k8s scale a Dask Cluster (I think k8s can
autoscale based on CPU and memory usage). In that case, yeah I'd say making
a DaskOperator would probably be the most straightforward way to go. You
can use almost every operator with the KubernetesExecutor, so you'd still
have the benefit of the executor elsewhere, but for this task you'd
basically be launching a pod just to monitor the Dask task and then exit.
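
To make the "submit, then monitor until done" idea concrete, here is a rough
sketch of the loop such an operator's execute() could wrap. It is not taken
from Airflow or from the DaskExecutor; run_and_monitor, train, poll_interval
and timeout are names made up for illustration. It only assumes the client
has a submit() method returning a future with done(), result() and cancel(),
which is the shape dask.distributed.Client offers. A stdlib thread pool
stands in for the Dask cluster so the snippet runs without one.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_and_monitor(client, fn, *args, poll_interval=1.0, timeout=None):
    """Submit fn(*args) to `client` and block until it finishes.

    `client` only needs submit() returning a future with done(), result()
    and cancel(). Hypothetical helper, not an Airflow or Dask API.
    """
    future = client.submit(fn, *args)
    deadline = None if timeout is None else time.monotonic() + timeout
    while not future.done():
        if deadline is not None and time.monotonic() > deadline:
            future.cancel()
            raise TimeoutError("task did not finish within %s seconds" % timeout)
        time.sleep(poll_interval)
    # result() re-raises any exception from the submitted task, which is what
    # would let the surrounding Airflow task (and its pod) fail.
    return future.result()


def train(n):
    # Hypothetical stand-in for an sklearn training job.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    # Assumption: ThreadPoolExecutor stands in for a Dask cluster here. Against
    # a real one you would pass a dask.distributed.Client pointed at your
    # scheduler instead.
    with ThreadPoolExecutor(max_workers=1) as pool:
        print(run_and_monitor(pool, train, 10, poll_interval=0.01))
```

The pod that runs this does no heavy lifting itself; it just polls and
passes the result or the failure back to Airflow, so it can be sized small.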

On Sun, Apr 29, 2018 at 3:47 PM Daniel Imberman <daniel.imberman@xxxxxxxxx>
wrote:

> @Kyle so what I'm trying to understand is why you would want to run a
> static Dask cluster when you can launch Dask containers/pods using the
> executor?
>
> Seems like there are a few possible options:
>
> 1. Add the Dask pip modules to the airflow docker image and call on that
> image in the executor_config whenever you need to launch a Dask task. This
> would allow you to launch Dask jobs whenever you want in an elastic manner.
> 2. If there are benefits to keeping the static Dask cluster, then writing
> a DaskOperator would be pretty straightforward. You could use the
> DaskExecutor as a scaffold and basically write an operator that sends a
> request to the Dask cluster and then monitors the job until the task is
> finished. You could also check out the KubernetesPodOperator to see how
> that would look.
>
>
>
> On Sun, Apr 29, 2018 at 2:58 PM Kyle Hamlin <hamlin.kn@xxxxxxxxx> wrote:
>
>> Hi Fokko,
>>
>> So it's always been my intention to use the KubernetesExecutor. What I'm
>> trying to figure out is how to pair the KubernetesExecutor with a
>> Dask cluster, since Dask clusters have many optimizations for ML-type
>> tasks.
>>
>> On Sat, Apr 28, 2018 at 2:29 PM Driesprong, Fokko <fokko@xxxxxxxxxxxxxx>
>> wrote:
>>
>> > Also one of the main benefits of the Kubernetes Executor is having a Docker
>> > image that contains all the dependencies that you need for your job.
>> > Personally I would switch to Kubernetes when it leaves the experimental
>> > stage.
>> >
>> > Cheers, Fokko
>> >
>> > 2018-04-28 16:27 GMT+02:00 Kyle Hamlin <hamlin.kn@xxxxxxxxx>:
>> >
>> > > I don't have a Dask cluster yet, but I'm interested in taking advantage of
>> > > it for ML tasks. My use case would be bursting a lot of ML jobs into a
>> > > Dask cluster all at once.
>> > > From what I understand, Dask clusters utilize caching to help speed up jobs,
>> > > so I don't know if it makes sense to launch a Dask cluster for every single
>> > > ML job. Conceivably, I could just have a single Dask worker running 24/7
>> > > and when it's time to burst, k8s could autoscale the Dask workers as more
>> > > ML jobs are launched into the Dask cluster?
>> > >
>> > > On Fri, Apr 27, 2018 at 10:35 PM Daniel Imberman <
>> > > daniel.imberman@xxxxxxxxx>
>> > > wrote:
>> > >
>> > > > Hi Kyle,
>> > > >
>> > > > So you have a static Dask cluster running in your k8s cluster? Is there any
>> > > > reason you wouldn't just launch the Dask cluster for the job you're running
>> > > > and then tear it down? I feel like with k8s the elasticity is one of the
>> > > > main benefits.
>> > > >
>> > > > On Fri, Apr 27, 2018 at 12:32 PM Kyle Hamlin <hamlin.kn@xxxxxxxxx>
>> > > wrote:
>> > > >
>> > > > > Hi all,
>> > > > >
>> > > > > If I have a Kubernetes cluster running in DCOC and a Dask cluster running
>> > > > > in that same Kubernetes cluster, is it possible/does it make sense to use
>> > > > > the KubernetesExecutor to launch tasks into the Dask cluster (these are ML
>> > > > > jobs with sklearn)? I feel like there is a bit of inception going on here
>> > > > > in my mind and I just want to make sure a setup like this makes sense.
>> > > > > Thanks in advance for anyone's input!
>> > > > >
>> > > >
>> > >
>> > >
>> > > --
>> > > Kyle Hamlin
>> > >
>> >
>>
>>
>> --
>> Kyle Hamlin
>>
>

