Re: Understand Broadcast State in Node Failure Case


Thanks Fabian for the clarification! 

Best regards,
Chengzhi



On Mon, Oct 22, 2018 at 5:19 PM Fabian Hueske <fhueske@xxxxxxxxx> wrote:
Hi Chengzhi,

Broadcast state is checkpointed like any other state and will be restored in all failure cases (including the ones you mentioned).
We added the warning to inform users that broadcast state is always kept in JVM memory, even if the RocksDB StateBackend (which stores state on disk) is configured.
This warning is only about the size of the state, not about the consistency guarantees.
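
To illustrate the distinction between where state lives at runtime and whether it survives a failure, here is a minimal toy model in plain Java. It is only a sketch under stated assumptions: `BroadcastStateSketch`, `RuleTask`, `onBroadcast`, `snapshot` and `restore` are hypothetical names made up for this example and are not Flink classes or methods. The rules are held in an on-heap map, a copy is taken at "checkpoint" time, and a replacement task is rebuilt from that copy after a simulated failure.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: none of these types are part of Flink's API.
class BroadcastStateSketch {

    /** One parallel task that keeps its copy of the broadcast rules on the heap. */
    static final class RuleTask {
        private Map<String, String> rules = new HashMap<>();

        /** Called for every element of the (hypothetical) broadcast rule stream. */
        void onBroadcast(String ruleId, String pattern) {
            rules.put(ruleId, pattern);
        }

        /** Returns false for events that match any rule, e.g. test data to drop. */
        boolean accepts(String event) {
            for (String pattern : rules.values()) {
                if (event.contains(pattern)) {
                    return false;
                }
            }
            return true;
        }

        /** Checkpoint: copy the in-memory state so it can outlive this task. */
        Map<String, String> snapshot() {
            return new HashMap<>(rules);
        }

        /** Recovery: build a fresh task from the last completed checkpoint. */
        static RuleTask restore(Map<String, String> checkpoint) {
            RuleTask task = new RuleTask();
            task.rules = new HashMap<>(checkpoint);
            return task;
        }
    }

    public static void main(String[] args) {
        RuleTask task = new RuleTask();
        task.onBroadcast("r1", "test-");

        Map<String, String> checkpoint = task.snapshot();

        // Simulated node failure: the task and its heap memory are gone.
        task = null;

        RuleTask recovered = RuleTask.restore(checkpoint);
        System.out.println(recovered.accepts("test-order-1")); // false
        System.out.println(recovered.accepts("order-2"));      // true
    }
}
```

In this model the recovered task filters exactly as the failed one did at checkpoint time, with no extra bootstrapping step. The heap-only storage affects how much rule state fits in memory, not whether it comes back.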

Best, Fabian

On Mon, Oct 22, 2018 at 7:26 PM Chengzhi Zhao <w.zhaochengzhi@xxxxxxxxx> wrote:
Hey folks,

We are trying to use broadcast state as a "shared rule" state to filter test data in our stream pipeline; the broadcast stream will be connected with other streams in the pipeline.
I noticed that the "Important Considerations" section of the broadcast_state[1] page mentions that there is no RocksDB state backend for broadcast state and that the state is kept in memory at runtime.

I am trying to figure out how it works. For example:
1. If a node goes down, will the broadcast state on that node be lost entirely and then synced from other nodes?
2. If the entire job fails or a savepoint is triggered, how does the broadcast state get restored? Or do we need to add bootstrapping logic ourselves?

Thanks for your help! 

Best regards,
Chengzhi