I have two tables - Customers
and Orders
. My requirement is to count the orders of every customer for each minute.
I want to use Flink SQL to implement this task. My approach is as follows:
CREATE TEMPORARY TABLE customers (
id INT,
name STRING
) WITH (
'connector' = 'jdbc',
'url' = 'jdbc:mysql://....'
);
CREATE TEMPORARY TABLE orders (
order_id STRING,
customer_id INT,
order_time TIMESTAMP(3),
WATERMARK FOR order_time AS order_time - INTERVAL '15' SECOND
) WITH (
'connector' = 'kafka',
'topic' = '...'
);
CREATE TEMPORARY VIEW order_per_minute AS
SELECT
customer_id,
count(*) as cnt,
TUMBLE_END(order_time, INTERVAL '1' MINUTE) AS window_end
FROM orders
GROUP BY customer_id, TUMBLE(tstamp, INTERVAL '1' MINUTE);
INSERT INTO destination
SELECT
COALESCE(window_end, CURRENT_TIMESTAMP),
customer_id,
COALESCE(cnt, 0),
FROM
customers LEFT JOIN order_per_minute
ON customers.id = order_per_minute.customer_id;
But it is not working as expected. Though there are 3 customers, only 2 of them are making orders every minute.
When the Flink task started, the first output was correct:
10:01, c1, 19
10:01, c2, 32
10:01, c3, 0
However, after the first minute, the output only includes the customers with orders.
10:02, c1, 18
10:02, c2, 22 // c3 is missing
10:03, c1, 18
10:03, c2, 22 // c3 is still missing
What can I do to resolve this issue? I am new to Flink and would appreciate your help in advance!