
Actual code:

select adjust.id, adjust.orginal_ticket_id, adjust.trans_history_id , adjust.ticket_id
    from trans_history_shared adjust
    where cdr_type=201
    and adjust.event_timestamp <= v_end_time_summary
    and adjust.create_date >= v_start_time_summary
    and adjust.create_date < v_end_time
    and exists (
        select origin.ticket_id
        from trans_history_shared origin
        where origin.ticket_id=adjust.orginal_ticket_id
        and origin.event_timestamp <= v_end_time_summary
        and origin.create_date >= v_start_time_summary
        and origin.create_date < v_end_time
        and (origin.revenue_date is null or origin.revenue_date = v_start_time_summary)
    )
    and ticket_id = (
        select latest.ticket_id
        from trans_history_shared latest
        where latest.orginal_ticket_id = adjust.orginal_ticket_id
        and cdr_type=201
        and latest.event_timestamp <= v_end_time_summary
        and latest.create_date >= v_start_time_summary
        and latest.create_date < v_end_time
        and (latest.revenue_date is null or latest.revenue_date=v_start_time_summary)
        order by latest.create_date desc, latest.event_timestamp desc
        fetch first 1 rows only);

Hi all, I'm trying to implement this query in Spark SQL, but it gives me the error "correlated scalar subqueries must be aggregated". Could you help me figure out how to fix this? Thank you for reading and have a nice day!
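For context, Spark SQL rejects a correlated scalar subquery unless it can statically guarantee one row per outer row, which in practice means the subquery must be a bare aggregate over the correlated filter; an order by ... fetch first 1 rows only subquery like the one above does not qualify. A minimal illustration, using a hypothetical table t with columns k, x and d:

    -- Rejected by Spark: it cannot prove this returns exactly one row per outer row
    select a.* from t a
    where a.x = (select b.x from t b where b.k = a.k
                 order by b.d desc limit 1);

    -- Accepted: the correlated scalar subquery is aggregated
    select a.* from t a
    where a.d = (select max(b.d) from t b where b.k = a.k);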

  • Maybe this https://stackoverflow.com/questions/40357613/what-does-correlated-scalar-subqueries-must-be-aggregated-mean can help. – PKey Sep 10 '20 at 10:38
  • Unfortunately it doesn't work for my case @Plirkee – QuangTM Sep 11 '20 at 02:05
  • You could convert your sub-selects to JOINs and perhaps satisfy Spark (a sketch of that rewrite appears after these comments). However, there may be a much easier solution: not necessarily a better query, just one that works. You have qualified your column names with an alias except for the cdr_type column, which is NOT qualified in either reference. Perhaps the source of your error is that the Spark compiler/interpreter is confused about which table is being referenced. Try changing these references to adjust.cdr_type and latest.cdr_type. – Belayer Sep 13 '20 at 21:26
  • *Tip_of_the_day*: When using table aliases, always use the alias on every column, even where the column name exists only once. It removes all ambiguity about the column's source, and you will not have to look it up later. And IMHO you should always use table aliases, but that is your choice. – Belayer Sep 13 '20 at 21:28
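Below is a sketch of the JOIN rewrite suggested in the comments, replacing the correlated fetch first 1 rows only subquery with a row_number() window in a CTE, which Spark supports. It is an untested sketch: it assumes the v_* variables are bound as literal timestamps before the statement reaches Spark (Spark SQL has no PL/SQL-style variables), and it keeps the correlated EXISTS, which Spark does accept.

    with latest as (
        -- rank each orginal_ticket_id group so rn = 1 is the newest row,
        -- mirroring the order by ... fetch first 1 rows only subquery
        select l.ticket_id,
               l.orginal_ticket_id,
               row_number() over (
                   partition by l.orginal_ticket_id
                   order by l.create_date desc, l.event_timestamp desc
               ) as rn
        from trans_history_shared l
        where l.cdr_type = 201
          and l.event_timestamp <= v_end_time_summary  -- placeholder: bind a literal
          and l.create_date >= v_start_time_summary    -- placeholder: bind a literal
          and l.create_date < v_end_time               -- placeholder: bind a literal
          and (l.revenue_date is null or l.revenue_date = v_start_time_summary)
    )
    select adjust.id, adjust.orginal_ticket_id, adjust.trans_history_id, adjust.ticket_id
    from trans_history_shared adjust
    join latest
      on latest.orginal_ticket_id = adjust.orginal_ticket_id
     and latest.ticket_id = adjust.ticket_id
     and latest.rn = 1
    where adjust.cdr_type = 201
      and adjust.event_timestamp <= v_end_time_summary
      and adjust.create_date >= v_start_time_summary
      and adjust.create_date < v_end_time
      and exists (
          select 1
          from trans_history_shared origin
          where origin.ticket_id = adjust.orginal_ticket_id
            and origin.event_timestamp <= v_end_time_summary
            and origin.create_date >= v_start_time_summary
            and origin.create_date < v_end_time
            and (origin.revenue_date is null or origin.revenue_date = v_start_time_summary)
      );

Note that row_number() breaks ties arbitrarily, just as fetch first 1 rows only does when (create_date, event_timestamp) is not unique within a group.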

0 Answers