There are a couple of reasons:
- Easier resource allocation and isolation: one faulty job
doesn't affect another.
- Mixing and matching Flink versions: you can let old, stable
jobs keep running on an older Flink version, and use the latest version of
Flink for new jobs.
- Faster metrics collection: Flink generates a lot of metrics;
by keeping each cluster small, our Prometheus instance can scrape
their metrics a lot faster.
On 10/26/2018 2:50 PM, Sayat Satybaldiyev wrote:
Thanks for the advice, Klein. Could you please
share more details on why it's best to allocate a
separate cluster for each job?
You can have multiple Flink
clusters on the same set of physical machines. In our experience, it's best
to deploy a separate Flink cluster for each job and
adjust the resources accordingly.
The Flink cluster runs in standalone mode with an HA
configuration. It has 6 task managers, each with 8
slots, for 48 slots in the cluster overall.
>> If your cluster has only one task manager
with one slot on each node, then the job should be
spread evenly.
Agreed, this would solve the issue. However, the
cluster is running other jobs, and in that case it
wouldn't have hardware resources left for them.
How are your task managers deployed?
If your cluster has only one task manager with one
slot on each node,
then the job should be spread evenly.
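For reference, a minimal sketch of what that layout could look like in flink-conf.yaml, assuming you run exactly one task manager process per node (the values below are illustrative, not a recommendation for your hardware):

```yaml
# Sketch: one task manager per node, each exposing a single slot,
# so a job with parallelism 6 is forced to spread across 6 nodes.
taskmanager.numberOfTaskSlots: 1

# Illustrative default; jobs can still override parallelism at submission.
parallelism.default: 6
```

With one slot per task manager, the scheduler has no choice but to place each parallel subtask on a different node; the trade-off, as noted above, is that those slots are then unavailable to other jobs.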
On 10/24/2018 4:35 PM, Sayat Satybaldiyev wrote:
> Is there any way to tell Flink not to
allocate all parallel tasks
> on one node? We have a stateless Flink job
that reads from a
> 10-partition topic and has a parallelism of 6.
The Flink job manager
> allocates all 6 parallel operators to one
machine, causing all the
> Kafka traffic to go to a single machine. We
have a cluster of 6 nodes,
> and ideally each parallel operator would run on
its own machine. Is there a
> way to do that in Flink?