I have multiple streams that I want to join (A to B, B to C, C to D, ...) to produce a single result, Z.
For example, when using the Table API to join 3 tables:
select * from A inner join B on a.pk_id = b.fk_id inner join C on b.pk_id = c.fk_id
What does the underlying state look like?
The join keys are different for each source. If the job is running in parallel, does Flink reshuffle the data?

urield

1 Answer

You can figure this out by looking at the job graph in the web UI. A shuffle is performed wherever you see a HASH connection between two operators.

This information is also included in the output of EXPLAIN <query>, but that's harder to grok (look for Exchange).
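
As a minimal sketch of the EXPLAIN approach, the statements below print the plan for the three-way join from the question. The table definitions are assumptions made up for illustration (the question does not show its schemas), and they use the built-in datagen connector only so the example is self-contained. The select list is spelled out with aliases rather than `*`, to avoid duplicate column names in the result.

```sql
-- Hypothetical tables for illustration only; replace with your real sources.
CREATE TABLE A (pk_id BIGINT) WITH ('connector' = 'datagen');
CREATE TABLE B (pk_id BIGINT, fk_id BIGINT) WITH ('connector' = 'datagen');
CREATE TABLE C (pk_id BIGINT, fk_id BIGINT) WITH ('connector' = 'datagen');

-- Print the plan for the join instead of running it.
EXPLAIN PLAN FOR
SELECT A.pk_id AS a_id, B.pk_id AS b_id, C.pk_id AS c_id
FROM A
INNER JOIN B ON A.pk_id = B.fk_id
INNER JOIN C ON B.pk_id = C.fk_id;
```

In the optimized plan section of the output, each Exchange node with a hash distribution should correspond to a HASH connection in the web UI. The exact plan text varies between Flink versions, so treat the node names as something to search for rather than a fixed format.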

David Anderson
  • I see, thanks. What about the state? What data type is being stored? How can I see its size? – urield Oct 02 '22 at 06:33
  • The SQL engine tries to keep as little state as possible. The state isn't readily visible or inspectable. You can get some idea of its size by looking at a savepoint. – David Anderson Oct 02 '22 at 10:22