Running Flink in Yarn

Question

I'm running Flink (1.4.2) on Yarn, and I'm using the Flink Yarn client to submit the job to the Yarn cluster.

Suppose I have a TM with 4 slots and I deploy a Flink job with parallelism=4 in 2 containers: 1 JM and 1 TM. Each parallel instance will be deployed in its own task slot in the TM (the entire job pipeline running per slot).

My jobs do a join (a SQL time-windowed join on a non-keyed stream) and buffer the last 3 hours of data. According to the Flink docs, the separate threads running in different task slots share data sets and data structures, thus reducing the per-task overhead.

My question is: will these threads running in different task slots share the data buffered for the join? What data is shared across these threads?

Edit

Sample Query -

    SELECT R.order_id, S.order.restaurant_id
    FROM awz_s3_stream1 R
    INNER JOIN awz_s3_stream2 S
      ON CAST(R.order_id AS VARCHAR) = S.order_id
      AND R.proctime BETWEEN S.proctime - INTERVAL '2' HOUR
                         AND S.proctime + INTERVAL '2' HOUR
    GROUP BY HOP(S.proctime, INTERVAL '2' MINUTE, INTERVAL '1' HOUR),
             S.order.restaurant_id

score 0 · Answer 1 · answered Dec 28 '18 at 10:40

Each Task receives its own disjoint partition of the input data. What the Tasks running on the same TaskManager do share are services and control data structures such as the network stack, network connections, RPC endpoints, and heartbeating between distributed components.
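
To illustrate what "disjoint partition" means for the buffered join data, here is a minimal sketch in plain Python (not Flink code). It assumes the join is partitioned on `order_id` and uses a simplified byte-sum modulo as a stand-in for Flink's real key-to-subtask assignment; `subtask_for`, `partition`, and the sample records are hypothetical names and data made up for the illustration.

```python
from collections import defaultdict

PARALLELISM = 4  # one parallel subtask per slot, as in the question


def subtask_for(key, parallelism=PARALLELISM):
    # Simplified, deterministic stand-in for key-based routing.
    # Flink's actual assignment is different; this only shows the idea.
    return sum(str(key).encode("utf-8")) % parallelism


def partition(records, parallelism=PARALLELISM):
    # Each subtask gets its own private buffer; nothing is shared.
    buffers = defaultdict(list)
    for order_id, payload in records:
        buffers[subtask_for(order_id, parallelism)].append((order_id, payload))
    return buffers


# Hypothetical input: 12 orders with integer ids.
records = [(order_id, f"order-{order_id}") for order_id in range(12)]
buffers = partition(records)

for subtask, buffered in sorted(buffers.items()):
    print(subtask, [order_id for order_id, _ in buffered])
```

Every `order_id` lands in exactly one buffer, so each slot only ever holds (and joins against) its own share of the data.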

Till Rohrmann

Thanks @Till. Suppose the join is on two Kafka streams, R and S. If the number of task slots in a single task manager equals the number of Kafka partitions, each thread in each task slot will consume from one Kafka partition. Won't these threads share the data set buffered for the join? What's the advantage of having multiple task slots in a task manager if I'm using Yarn deployment mode, where the entire pipeline is deployed in a single TM? – user3107673 Jan 02 '19 at 19:45
Multiple slots on a TM give you increased parallelism (one `Task` is executed by a single thread) without having to pay the overhead of an additional TM instance. Moreover, the communication between `Tasks` running on the same TM won't need to go through serialization-deserialization. – Till Rohrmann Jan 02 '19 at 20:07
In my case, the entire job pipeline is deployed in a single task slot, so there's no communication between TMs. Currently I have one slot per TM, so each TM maintains the entire data set to be joined. I was curious whether there's a way to allow the different threads across task slots on a single TM to share this data set. Here's a link to the question I posted on the Flink user group [http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Running-Flink-on-Yarn-td25222.html] – user3107673 Jan 02 '19 at 20:17
If you run the job with a parallelism of `1` and are satisfied with the performance, then you don't need additional slots. Only if you want to increase the throughput or decrease the latency, for example, might you want to increase the parallelism. If the parallelism is `1`, then the complete join data will be sent to and kept on one `TM`. – Till Rohrmann Jan 03 '19 at 12:19
If I have 2 TMs (say each with multiple slots), will there be only 2 copies of the join data, one on each TM? Will the task slots share the data on each TM? – user3107673 Jan 03 '19 at 13:41
Slots on the same TM won't share data. So if you use a broadcast join, the broadcast data set will be replicated to every slot. In the case of a partitioned join, you partition the data with respect to a key (into 4 partitions) and assign each partition to a separate slot. The partitions are disjoint, but you need to shuffle the data. – Till Rohrmann Jan 03 '19 at 15:46
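
As a rough back-of-the-envelope check of the last comment, the following plain-Python sketch compares how many records end up buffered under a broadcast join versus a partitioned join. It assumes keys are evenly distributed and that slots never share state; `buffered_records` and the record count of 1,000,000 are hypothetical and only serve the illustration.

```python
def buffered_records(total_records, tms, slots_per_tm, strategy):
    """Estimate buffered records per slot, per TM, and cluster-wide."""
    slots = tms * slots_per_tm
    if strategy == "broadcast":
        # Every slot keeps a full copy of the data set.
        per_slot = total_records
    elif strategy == "partitioned":
        # Every slot keeps only its share (ceiling division, even keys assumed).
        per_slot = -(-total_records // slots)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return {
        "per_slot": per_slot,
        "per_tm": per_slot * slots_per_tm,
        "cluster": per_slot * slots,
    }


# Hypothetical setup: 2 TMs with 4 slots each, 1,000,000 records to buffer.
broadcast = buffered_records(1_000_000, tms=2, slots_per_tm=4, strategy="broadcast")
partitioned = buffered_records(1_000_000, tms=2, slots_per_tm=4, strategy="partitioned")
print("broadcast:  ", broadcast)
print("partitioned:", partitioned)
```

Under these assumptions a broadcast join holds 4 copies per TM (one per slot, not one per TM), while a partitioned join holds the data set roughly once across the whole cluster.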
