I have multiple streams that I want to join (A to B, B to C, C to D, ...) to produce a single stream Z.
For example, when using the Table API to join three tables:
SELECT * FROM A INNER JOIN B ON A.pk_id = B.fk_id INNER JOIN C ON B.pk_id = C.fk_id
What does the underlying state look like?
The join keys differ from one source to the next. If the job runs in parallel, does Flink reshuffle the data?

urield
1 Answer
You can figure this out by looking at the job graph in the web UI. There's a shuffle being done everywhere you see a HASH connection.
This information is also included in the output of EXPLAIN <query>, but that's harder to grok (look for Exchange).
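To make the idea of a HASH connection more concrete, here is a toy model in Python. It is not Flink code and does not use Flink's actual hash function; the row layout, the parallelism of 4, and the helper names `subtask_for` and `hash_exchange` are all assumptions made for illustration. It only sketches why a second join on a different key generally forces the same rows to be routed again: the subtask that owns a row is derived from the join key, and the key changes between the join of A with B and the join of B with C.

```python
# Toy model only: NOT Flink code, and crc32 is a stand-in for Flink's real hashing.
from zlib import crc32

PARALLELISM = 4  # hypothetical number of parallel subtasks


def subtask_for(key, parallelism=PARALLELISM):
    """Return the index of the subtask that owns this join key."""
    return crc32(str(key).encode("utf-8")) % parallelism


def hash_exchange(rows, key_field, parallelism=PARALLELISM):
    """Route every row to a subtask by hashing the chosen key field."""
    partitions = {i: [] for i in range(parallelism)}
    for row in rows:
        partitions[subtask_for(row[key_field], parallelism)].append(row)
    return partitions


# Hypothetical rows of B: pk_id is B's own key, fk_id points at a row of A.
b_rows = [{"pk_id": 100 + i, "fk_id": i % 3} for i in range(8)]

# Join of A with B: B is partitioned by fk_id so matching rows of A and B meet.
by_fk = hash_exchange(b_rows, "fk_id")

# Join of B with C: the same rows must now be partitioned by pk_id instead.
by_pk = hash_exchange(b_rows, "pk_id")

# Rows whose owning subtask differs between the two joins have to move.
moved = [r["pk_id"] for r in b_rows
         if subtask_for(r["fk_id"]) != subtask_for(r["pk_id"])]
print("rows that change subtask between the two joins:", moved)
```

In this model, every row with the same key value always lands on the same subtask, which is what lets a join operator find its matches locally. Which rows move depends on the hash, so the printed list is only illustrative.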

David Anderson
- I see, thanks. What about the state? What data type is being stored? How can I see its size? – urield Oct 02 '22 at 06:33
- The SQL engine tries to keep as little state as possible. The state isn't readily visible or inspectable. You can get some idea of its size by looking at a savepoint. – David Anderson Oct 02 '22 at 10:22