3

I am investigating some things for work where we have to prove the value of Kafka and KSQL as part of a POC. We are working on simple game events where a player can make a play and the game engine either assigns them points (considered a win) or they get nothing (considered a loss). One of our requirements is to prove the feasibility of KSQL on consecutive losses or wins. For the purposes of this question, I will focus on consecutive losses.

We have two tables, one for wins and one for losses that run against the same stream of game events.

The wins DDL is as follows:

CREATE TABLE TABLE__GAME_EVENTS__WINS AS
    SELECT  userId,
            gameName,
            COUNT(*) AS totalWins
    FROM STREAM__GAME_EVENTS
    WINDOW TUMBLING (SIZE 30 MINUTES)
    WHERE pointsAllocated > 0
    GROUP BY    userId,
                gameName;

The losses DDL is as follows:

CREATE TABLE TABLE__GAME_EVENTS__LOSSES AS
    SELECT  userId,
            gameName,
            COUNT(*) AS totalLosses
    FROM STREAM__GAME_EVENTS
    WINDOW TUMBLING (SIZE 30 MINUTES)
    WHERE pointsAllocated = 0
    GROUP BY    userId,
                gameName;

In order to achieve consecutive losses in the 30 minute window period, our current approach is to JOIN the WINS table onto the LOSSES table and only fire when there is nothing in the WINS table and something in the LOSSES table.

To do that, we have the following code:

SELECT  losses.userId,
        losses.gameName,
        losses.totalLosses
    FROM TABLE__GAME_EVENTS__LOSSES losses
        JOIN TABLE__GAME_EVENTS__WINS wins
            ON wins.userId = losses.userId
            AND wins.gameName = losses.gameName
    WHERE losses.totalLosses >= 1
        AND wins.totalWins = 0;

When I try to execute that code I get the following error: io.confluent.ksql.parser.tree.LogicalBinaryExpression cannot be cast to io.confluent.ksql.parser.tree.ComparisonExpression

Initially I thought it was to do with the WHERE clause but after looking at Hojjat's answer to a similar question, it appears to be that the issue has to do with the AND statement in the JOIN.

When I ran DESCRIBE EXTENDED on both tables I get the following (same for both):

ksql> DESCRIBE EXTENDED TABLE__GAME_EVENTS__WINS;

Name                 : TABLE__GAME_EVENTS__WINS
Type                 : TABLE
Key field            : KSQL_INTERNAL_COL_0|+|KSQL_INTERNAL_COL_1
Key format           : STRING
Timestamp field      : Not set - using <ROWTIME>
Value format         : JSON
Kafka topic          : TABLE__GAME_EVENTS__WINS (partitions: 4, replication: 1)

 Field     | Type
---------------------------------------
 ROWTIME   | BIGINT           (system)
 ROWKEY    | VARCHAR(STRING)  (system)
 USERID    | INTEGER
 GAMENAME  | VARCHAR(STRING)
 TOTALWINS | BIGINT
---------------------------------------

Given that COL_0 - userId and COL_1 - gameName are the key fields on both tables, it seems logical that they have to be included in the JOIN. Is it possible to get around this or is it currently not possible to JOIN on multiple columns?

Matthias J. Sax
  • 59,682
  • 7
  • 117
  • 137
KrylixZA
  • 31
  • 3

1 Answers1

3

You can't have multiple expressions in your ON clause; you can only have one.

Try moving one of them into your WHERE clause instead:

SELECT  losses.userId,
        losses.gameName,
        losses.totalLosses
    FROM TABLE__GAME_EVENTS__LOSSES losses
        JOIN TABLE__GAME_EVENTS__WINS wins
            ON wins.userId = losses.userId
    WHERE   wins.gameName = losses.gameName
        AND losses.totalLosses >= 1
        AND wins.totalWins = 0;
Robin Moffatt
  • 30,382
  • 3
  • 65
  • 92