0

I was confused about the difference between "broadcast state" and the broadcast() operator, and eventually got help from a Flink expert in the following thread.

What does it mean that "broadcast state" unblocks the implementation of the “dynamic patterns” feature for Flink’s CEP library?

In the end, the conclusion seemed to be that "broadcast state" can store the dynamic rules in the keyed stream via a RichCoFlatMap, whereas the broadcast() operator cannot. So how does "broadcast state" store the dynamic rules via RichCoFlatMap, and why can't the broadcast() operator do the same? Could I get an example that explains it?

YuFeng Shen

2 Answers

1

Those are two completely different concepts. Moreover, broadcast() is a kind of prerequisite for BroadcastState.

broadcast() specifies how the data is partitioned: it says that each element of the stream should be sent to every parallel instance of the downstream operator.
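To make the partitioning idea concrete, here is a minimal sketch in plain Java. It is a toy model and not Flink API: the class `BroadcastPartitioningDemo` and its methods are hypothetical names invented for illustration, and each "inbox" simply stands for what one parallel downstream instance would receive.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these names come from Flink.
class BroadcastPartitioningDemo {

    // Broadcast partitioning: every downstream instance receives every element.
    static List<List<String>> broadcast(List<String> elements, int parallelism) {
        List<List<String>> inboxes = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            inboxes.add(new ArrayList<>(elements));
        }
        return inboxes;
    }

    // For contrast, hash partitioning: each element goes to exactly one instance.
    static List<List<String>> hashPartition(List<String> elements, int parallelism) {
        List<List<String>> inboxes = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            inboxes.add(new ArrayList<>());
        }
        for (String element : elements) {
            int target = Math.floorMod(element.hashCode(), parallelism);
            inboxes.get(target).add(element);
        }
        return inboxes;
    }

    public static void main(String[] args) {
        List<String> rules = List.of("rule-A", "rule-B");
        System.out.println("broadcast: " + broadcast(rules, 3));
        System.out.println("hash:      " + hashPartition(rules, 3));
    }
}
```

With broadcast partitioning every inbox ends up with the full list of rules, which is why it is the natural way to ship rules or configuration to all parallel instances. Note that this only describes delivery; it says nothing about where the receiving operator keeps what it received.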

BroadcastState is operator state that, first of all, can be read and written from the broadcast stream and only read from the non-broadcast one. Before that there was no way to join two such streams. Moreover, this state ensures that after a restore, the state is the same in every parallel instance.
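The read-write versus read-only split can be sketched with another plain-Java toy model. Again, this is not Flink API and not how Flink implements it: `BroadcastStateModel`, `Instance`, `onRule`, `onEvent` and the threshold-style rules are all assumptions made up for the example. Each `Instance` stands for one parallel instance of a two-input operator that keeps its own copy of the rules.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: none of these names come from Flink.
class BroadcastStateModel {

    // One parallel instance of a hypothetical two-input operator.
    static class Instance {
        // Rule name -> threshold; a TreeMap keeps iteration order deterministic.
        private final Map<String, Integer> rules = new TreeMap<>();

        // Broadcast side: the only place where the rule state is modified.
        void onRule(String name, int threshold) {
            rules.put(name, threshold);
        }

        // Non-broadcast side: works on a read-only view of the rules.
        List<String> onEvent(String key, int value) {
            List<String> alerts = new ArrayList<>();
            for (Map.Entry<String, Integer> rule : rulesView().entrySet()) {
                if (value > rule.getValue()) {
                    alerts.add(key + " exceeded " + rule.getKey());
                }
            }
            return alerts;
        }

        Map<String, Integer> rulesView() {
            return Collections.unmodifiableMap(rules);
        }
    }

    static List<Instance> createInstances(int parallelism) {
        List<Instance> instances = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            instances.add(new Instance());
        }
        return instances;
    }

    // A rule is delivered to every instance, so all copies stay identical.
    static void broadcastRule(List<Instance> instances, String name, int threshold) {
        for (Instance instance : instances) {
            instance.onRule(name, threshold);
        }
    }

    // An event is routed by key to exactly one instance.
    static List<String> sendEvent(List<Instance> instances, String key, int value) {
        int target = Math.floorMod(key.hashCode(), instances.size());
        return instances.get(target).onEvent(key, value);
    }

    public static void main(String[] args) {
        List<Instance> instances = createInstances(3);
        broadcastRule(instances, "high", 100);
        System.out.println(sendEvent(instances, "user-1", 150));
        System.out.println(sendEvent(instances, "user-2", 50));
    }
}
```

Because every instance applies the same rule updates and the event side never writes, an event sees the same rules no matter which instance its key is routed to. In Flink itself, the corresponding pieces would be, as far as I know, `broadcast(MapStateDescriptor)` to obtain a `BroadcastStream` and a `KeyedBroadcastProcessFunction` with `processBroadcastElement` (read-write) and `processElement` (read-only); check the docs for the exact signatures in your version.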

For more information on BroadcastState, have a look at the docs.

Dawid Wysakowicz
  • "Before that there was no way to join such two streams": how about using the broadcast() operator with CoFlatMapFunction and CheckpointedFunction? broadcast() makes sure that "each element of the stream should be broadcasted to each parallel downstream operator", and CheckpointedFunction makes sure the state is fault tolerant and can be rescaled. Wouldn't that achieve the same effect as "broadcast state"? – YuFeng Shen May 30 '18 at 15:38
  • Yes, you could implement it that way, and that is pretty much how BroadcastState is implemented. Moreover, one crucial additional feature that BroadcastState introduced is the possibility to iterate over all keys, which is also crucial in "control stream" use cases. – Dawid Wysakowicz May 30 '18 at 16:34
0

They are different concepts: BroadcastState is a storage concept, while broadcast() is an operation whose purpose is to build a BroadcastStream for you.