Re: Question regarding Streaming Resources


Hi Bhaskar,

I assume you don’t have 1000 streams, but rather one (keyed) stream with 1000 different key values, yes?

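If so, this one stream is physically partitioned according to the parallelism of the operator that follows the keyBy(), not once per unique key. Keys that rarely see data therefore don't hold slots or threads of their own; they share the operator's parallel subtasks with all the other keys.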
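
As a rough illustration of that point, here is a minimal sketch in plain Java (not Flink code). It assumes a simplified hash-modulo assignment; Flink's actual routing uses its own hashing scheme, so the class and method names below (KeyPartitionSketch, subtaskFor, keysPerSubtask) are hypothetical and only model the idea that the number of partitions comes from the parallelism, not from the number of keys.

```java
import java.util.HashMap;
import java.util.Map;

public class KeyPartitionSketch {

    // Simplified stand-in for key-to-subtask routing (assumption: plain
    // hash modulo parallelism; Flink's real assignment differs in detail).
    static int subtaskFor(String key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    // Counts how many distinct keys land on each parallel subtask.
    static Map<Integer, Integer> keysPerSubtask(int numKeys, int parallelism) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int i = 0; i < numKeys; i++) {
            counts.merge(subtaskFor("key-" + i, parallelism), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // 1000 keys, parallelism 4: at most 4 partitions exist, however many keys there are.
        Map<Integer, Integer> counts = keysPerSubtask(1000, 4);
        System.out.println("subtasks used: " + counts.size());
        System.out.println("keys per subtask: " + counts);
    }
}
```

Whatever the key count, the map never has more entries than the parallelism, and a given key always routes to the same subtask.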

The most common per-key “resource” is the memory required for each key's state, if you have any operations that need to maintain state (accumulators, windows, etc.).

For 1000 unique keys, this should be negligible.
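
For a back-of-the-envelope check, the arithmetic can be sketched as below. The 1 KB of state per key is purely an assumed figure for illustration (the real size depends on what your operators store), and StateSizeSketch and estimateBytes are hypothetical names, not part of Flink.

```java
public class StateSizeSketch {

    // Total state footprint if every key holds roughly the same amount of state.
    static long estimateBytes(long numKeys, long bytesPerKey) {
        return numKeys * bytesPerKey;
    }

    public static void main(String[] args) {
        long numKeys = 1000;       // from the question
        long bytesPerKey = 1024;   // ASSUMPTION: ~1 KB of state per key, for illustration only
        long total = estimateBytes(numKeys, bytesPerKey);
        System.out.printf("approx. state size: %d bytes (%.2f MiB)%n",
                total, total / (1024.0 * 1024.0));
    }
}
```

Even with a per-key figure several times larger than the one assumed here, the total stays small relative to a typical task manager heap.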

— Ken


On Sep 12, 2018, at 9:55 AM, bhaskar.ebay77@xxxxxxxxx wrote:

Hi

I have created a KeyedStream with state, as explained below.
For example, I have created 1000 streams, of which 50% will receive data only once every 8 hours. Will the resources of these underutilized streams sit idle for that duration, or does Flink's task manager have some strategy for using them for other new streams that arrive?
Regards
Bhaskar

--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
Custom big data solutions & training
Flink, Solr, Hadoop, Cascading & Cassandra