4

I am new to ksql and have been using mysql for the longest time. I would like to know is there anyway to have subqueries in KSQL?

This query works without any problem :

SELECT a.executedate, count(a.pno), sum(a.amount) FROM
tb3_withdraw_record_summary a WHERE a.status='3' GROUP BY
a.executedate;

Whereas this query returns an error message:

SELECT a.executedate, count(a.pno), sum(a.amount), (SELECT COUNT(b.pno)
FROM tb3_withdraw_record_summary b WHERE b.status='5' AND
b.executedate = a.executedate) FROM tb3_withdraw_record_summary a
WHERE a.status='3' GROUP BY a.executedate

'Failed to prepare statement: 'B' is not a valid stream/table name or alias. Caused by: 'B' is not a valid stream/table name or alias.'

Anyway for me to make this work? Thanks!

Matthias J. Sax
  • 59,682
  • 7
  • 117
  • 137
kurapika
  • 166
  • 1
  • 12

3 Answers3

4

nested query feature is not currently supported by Ksql but you can do it in following way -

1) CREATE STREAM B AS SELECT COUNT(b.pno)
FROM tb3_withdraw_record_summary b WHERE b.status='5';
2) SELECT a.executedate, count(a.pno), sum(a.amount) FROM tb3_withdraw_record_summary a JOIN B within 5 hours ON b.executedate = a.executedate WHERE a.status='3' GROUP BY a.executedate

Keep in mind that join is very different meaning then relational database world, here data is being partitioned through keys in multiple buckets and it conceptually a "colocated" join. more details about time-window here.

Hope it will help.

Varun Bajaj
  • 1,033
  • 8
  • 16
  • If I want a result of a 'Total Count', 'Count for status 3', and 'Count for status 5', does that means that I have to create another stream for status = '3'? As my current tb3_withdraw_record_summary stream is without filtering any status. – kurapika Mar 12 '19 at 02:09
  • 1
    To make life easier try to divide your bigger chunk of sql into smaller stream and join final result streams as per business logic. Robin Moffatt has provided an excellent example around similar usecase. Here is the link. - https://www.confluent.io/blog/atm-fraud-detection-apache-kafka-ksql – Varun Bajaj Mar 12 '19 at 07:00
2

SubQuery functionality is not implemented for KSQL.

https://github.com/confluentinc/ksql/issues/745

Rohit Yadav
  • 2,252
  • 16
  • 18
0

I am not familiar with KSQL, but perhaps this does what you want:

SELECT wrs.executedate,
       SUM(CASE WHEN wrs.status IN ('3') THEN 1 ELSE 0 END), 
       SUM(CASE WHEN wrs.status IN ('3') THEN amount ELSE 0 END), 
       SUM(CASE WHEN wrs.status IN ('5') THEN 1 ELSE 0 END) 
FROM tb3_withdraw_record_summary wrs
WHERE wrs.status IN ('3', '5')
GROUP BY wrs.executedate;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • "No SUM aggregate function with null argument type exists!" Sadly this doesn't work, it returns this error message. – kurapika Mar 12 '19 at 02:01
  • @kurapika . . . That is particularly curious, because there is no `NULL`, unless `amount` is `NULL`. But `SUM()` should handle that with no problem. – Gordon Linoff Mar 12 '19 at 02:08