Re: Triggering Savepoint for the Flink Job

Hi Anil,

Glad to know that you upgraded the system to 1.4. From our experience, if I remember correctly, quite a few changes are required to adapt to the new deployment model in 1.4.
The "run detach" deployment model in AthenaX does not support reattaching to the job, so we use the REST API for all subsequent life-cycle management.

There are a couple of workarounds I can think of if upgrading to 1.5 is not an option:
- Try using the CLI [1] instead of the REST API by replacing the life-cycle management component in WatchdogPolicy, so that you can trigger savepoints.
- Try changing the deployment model of AthenaX so that it does not use "run detach" mode, by modifying the "YarnClusterDescriptor".
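
For the first option, here is a rough sketch of how a wrapper could assemble the CLI call. It assumes the standard "flink savepoint <jobId> [targetDirectory] -yid <yarnAppId>" form from the Flink CLI docs; please verify the flags against your 1.4 installation. The function names, the job ID, the YARN application ID and the HDFS path are all made up for illustration, and none of this is actual AthenaX code.

```python
import subprocess
from typing import List, Optional


def build_savepoint_command(
    job_id: str,
    yarn_app_id: Optional[str] = None,
    target_dir: Optional[str] = None,
    flink_bin: str = "flink",
) -> List[str]:
    """Build the argument list for `flink savepoint` (hypothetical helper).

    Assumed CLI shape: flink savepoint <jobId> [targetDirectory] [-yid <yarnAppId>]
    """
    cmd = [flink_bin, "savepoint", job_id]
    if target_dir:
        # Optional: falls back to the configured default savepoint directory.
        cmd.append(target_dir)
    if yarn_app_id:
        # Needed when the job runs in a YARN session/application.
        cmd += ["-yid", yarn_app_id]
    return cmd


def trigger_savepoint(job_id: str, **kwargs) -> str:
    """Run the command and return the CLI's stdout (raises on non-zero exit)."""
    result = subprocess.run(
        build_savepoint_command(job_id, **kwargs),
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Placeholder IDs and path, only to show the resulting command line.
    print(" ".join(build_savepoint_command(
        "5e20cb6b0f357591171dfcca2eea09de",
        yarn_app_id="application_1527000000000_0001",
        target_dir="hdfs:///flink/savepoints",
    )))
```

In AthenaX the equivalent call would have to be made from the life-cycle management code on the Java side; the sketch only shows the shape of the command.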


Hope this helps with your use case.


On Thu, May 31, 2018 at 8:38 PM, Anil <anilsingh.jsr@xxxxxxxxx> wrote:
Thanks for the reply, Rong. We have updated AthenaX to version 1.4.

I checked Flink 1.4; its REST endpoint does not support creating a
savepoint on its own. It only has cancel-with-savepoint. I think creating a
savepoint by itself is supported in 1.5. Since we can't upgrade to 1.5 at
the moment, I would like to find a workaround for now.

Can you tell me how to reattach to a running job in the cluster?