
Let me put it out there: I am an absolute beginner with Flink and am trying to grasp as many concepts as possible.

Let's say I have a Flink cluster with 10 task managers and a Flink job running on them. The job also uses broadcast state. This broadcast state is created by reading 5 S3 files every 10 minutes, doing some processing, and building a map of int to list of strings, which is then broadcast.

Question: Where does the reading of the files happen? Is it the JobManager that reads and processes the files and sends the processed content over to the task managers?

Or

is it the task managers that do all the reading and processing? If so, how does Flink make sure that if a task manager fails to read from S3, the broadcast state is the same at all task managers?

EDIT

So the task managers read the broadcast stream and broadcast it to the downstream tasks.

E.g., let's say there is a Kafka stream with 5 partitions which needs to be broadcast, and a downstream operator with a parallelism of 5 as well.

  1. The partition 1 consumer task reads an element from the stream and puts it into the broadcast state. As soon as this is set, the element is broadcast to all 5 downstream operator tasks.
  2. The partition 2 consumer task reads an element from the stream and puts it into the broadcast state.

Question: At this point, do we need to make sure that we do not overwrite the elements from partition 1 when we set the broadcast state from the partition 2 element, or does Flink manage this itself?

OR

Also, how can we be sure that by the time partition 2 has consumed an element and set the broadcast state, the state broadcast from partition 1 has reached partition 2's downstream operator task?

Gaurav Kumar

1 Answer


Where does the reading of files happen?

The TaskManagers. The JobManager is only responsible for managing the job, such as scheduling and failover.

How is the processed content sent over to the task managers?

You can simply picture the broadcast process as sending the same message to all of the downstream tasks instead of to one specific task.
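That mental model can be sketched in a few lines. This is a plain Python simulation, not the Flink API; the functions and task lists are made up for illustration:

```python
# Sketch: "broadcast" delivery vs. "keyed" delivery to downstream subtasks.
# Plain Python simulation -- not Flink code.

def broadcast(element, downstream_tasks):
    """Every downstream subtask receives the same element."""
    for task in downstream_tasks:
        task.append(element)

def key_by(element, key, downstream_tasks):
    """A keyed element goes to exactly one subtask, picked by the key."""
    downstream_tasks[hash(key) % len(downstream_tasks)].append(element)

tasks = [[] for _ in range(5)]  # 5 downstream subtasks
broadcast({"rule_id": 1, "pattern": "error"}, tasks)

# After the broadcast, all 5 subtasks hold the same element.
print(all(t == [{"rule_id": 1, "pattern": "error"}] for t in tasks))  # True
```

The contrast with `key_by` is the point: a keyed element lands on one subtask, while a broadcast element is replicated to every subtask, which is what lets each downstream task build its own copy of the broadcast state.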

How does Flink handle it if a task manager fails to read from S3?

If a source task fails to read from S3, I believe there will be a restart (maybe a full restart, maybe a partial one, depending on the failover strategy), and the checkpoint mechanism will ensure the consistency of the state.

Is the broadcast state the same at all task managers?

Actually, the broadcast state is not guaranteed to be exactly the same in all tasks at every moment. The reason is that events cannot be guaranteed to be delivered to every task in the same order during network transfer.
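You can simulate that divergence. Suppose two broadcast source subtasks each emit a last-write-wins update for the same key (the key and values here are hypothetical): two downstream subtasks that receive the updates in different interleavings end up with different state until another update arrives.

```python
# Two broadcast source subtasks each emit an update for the same key.
# Flink preserves order within a channel, but not across channels, so
# two downstream subtasks may apply the updates in different orders.

update_from_source_1 = ("config", "v1")
update_from_source_2 = ("config", "v2")

def apply(updates):
    state = {}
    for key, value in updates:  # last write wins
        state[key] = value
    return state

# Downstream subtask A sees source 1 first; subtask B sees source 2 first.
state_a = apply([update_from_source_1, update_from_source_2])
state_b = apply([update_from_source_2, update_from_source_1])

print(state_a)  # {'config': 'v2'}
print(state_b)  # {'config': 'v1'} -- transiently different from subtask A
```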

Jiayi Liao
  • Thanks. If the broadcast stream has a parallelism of, e.g., 3, do I then need to make sure, while processing a broadcast stream element at each task, that the broadcast state written by the other tasks is not overwritten, or does Flink take care of it? – Gaurav Kumar Oct 21 '19 at 15:05
  • Yes, but it depends. For example, say you want to broadcast some kind of rules, with a unique rule id for each rule. A rule message with a given rule id may appear in any one of the broadcast source tasks, so you should make sure the downstream tasks keep the latest one in this situation. – Jiayi Liao Oct 22 '19 at 01:15
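The "keep the latest one" idea from the last comment can be made order-independent by attaching a version to each rule update. A minimal sketch (hypothetical rule records, plain Python, not Flink API): if every downstream subtask applies "keep the highest version per rule id", the final state is the same no matter which delivery order a subtask saw.

```python
import itertools

def apply_update(state, rule):
    """Keep the update with the highest version per rule id."""
    rule_id, version, payload = rule
    current = state.get(rule_id)
    if current is None or version > current[0]:
        state[rule_id] = (version, payload)

updates = [
    (42, 1, "old pattern"),
    (42, 2, "new pattern"),
    (7,  1, "other rule"),
]

# Replay every possible delivery order and collect the final states.
results = set()
for order in itertools.permutations(updates):
    state = {}
    for rule in order:
        apply_update(state, rule)
    results.add(tuple(sorted(state.items())))

print(len(results))  # 1 -- the same final state for every delivery order
```

This kind of versioned, last-writer-wins merge is one way to make the broadcast state converge across subtasks even though Flink gives no cross-channel ordering guarantee.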