1

I have three ksqldb tables, whose relation is illustrated in the picture below. I must join them.

This query will result in an error:

CREATE TABLE `reviewer-email-user`
    AS SELECT *
    FROM USER
        INNER JOIN REVIEWER ON USER.USERID = REVIEWER.USERID
        INNER JOIN EMAILADDRESS ON USER.USERID = EMAILADDRESS.USERID
EMIT CHANGES;

And the error is:

Could not determine output schema for query due to error: Invalid join condition: table-table joins require to join on the primary key of the right input table. Got USER.USERID = REVIEWER.USERID

enter image description here

So, how do I join these three ksqldb tables? Thank you.

Ciro di Marzo
  • 391
  • 2
  • 15

1 Answers1

1

table-table joins expect the join condition to have the primary key on the right side and therefore, the following will not work:

ksql> CREATE TABLE reviewer_user
> AS SELECT *
> FROM REVIEWER
>      INNER JOIN USER ON USER.user_id = REVIEWER.user_id
>EMIT CHANGES;
Could not determine output schema for query due to error: Cannot add table 'REVIEWER_USER': A table with the same name already exists
Statement: CREATE TABLE REVIEWER_USER WITH (KAFKA_TOPIC='REVIEWER_USER', PARTITIONS=2, REPLICAS=1) AS SELECT *
FROM REVIEWER REVIEWER
INNER JOIN USER USER ON ((USER.USER_ID = REVIEWER.USER_ID))
EMIT CHANGES;

however the following query does work (note that I have flipped the left and right side of the join)

ksql> CREATE TABLE reviewer_user
> AS SELECT *
> FROM REVIEWER
>      INNER JOIN USER ON REVIEWER.user_id = USER.user_id
>EMIT CHANGES;

 Message
---------------------------------------------
 Created query with ID CTAS_REVIEWER_USER_11
---------------------------------------------

Another limitation of table-table joins is that it doesn't support n-way joins, so you would have to create 2 new tables(reviewer_user and email_user) by performing joins as suggested above and then finally perform a join on them to get your final result.

Bonnie Varghese
  • 2,158
  • 1
  • 16
  • 11