0

I have two tables - Customers and Orders. My requirement is to count the orders of every customer for each minute.

I want to use Flink SQL to implement this task. My approach is as follows:


CREATE TEMPORARY TABLE customers (
    id INT,
    name STRING
) WITH (
    'connector' = 'jdbc',
    'url' = 'jdbc:mysql://....'
);

CREATE TEMPORARY TABLE orders (
    order_id     STRING,
    customer_id  INT,
    order_time   TIMESTAMP(3),
    WATERMARK FOR order_time AS order_time - INTERVAL '15' SECOND
) WITH (
    'connector' = 'kafka',
    'topic' = '...'
);

CREATE TEMPORARY VIEW order_per_minute AS
SELECT
    customer_id,
    count(*) as cnt,
    TUMBLE_END(order_time, INTERVAL '1' MINUTE) AS window_end
FROM orders
GROUP BY customer_id, TUMBLE(tstamp, INTERVAL '1' MINUTE);

INSERT INTO destination
SELECT
    COALESCE(window_end, CURRENT_TIMESTAMP),
    customer_id,
    COALESCE(cnt, 0),
FROM
    customers LEFT JOIN order_per_minute
        ON customers.id = order_per_minute.customer_id;

But it is not working as expected. Though there are 3 customers, only 2 of them are making orders every minute.
When the Flink task started, the first output was correct:

10:01, c1, 19  
10:01, c2, 32  
10:01, c3, 0

However, after the first minute, the output only includes the customers with orders.

10:02, c1, 18  
10:02, c2, 22 // c3 is missing  
10:03, c1, 18  
10:03, c2, 22 // c3 is still missing

What can I do to resolve this issue? I am new to Flink and would appreciate your help in advance!

Kerie
  • 121
  • 5

0 Answers0