I am migrating a Shark query to Spark SQL.
Below is a sample Shark query that uses functions in the GROUP BY clause.
select month(dt_cr) as Month,
day(dt_cr) as date_of_created,
count(distinct phone_number) as total_customers
from customer
group by month(dt_cr),day(dt_cr);
The same query does not work in Spark SQL; it gives the error below:
Error : org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Expression not in GROUP BY.
As a workaround I am using the Spark SQL query below. It works, but it requires code changes, which would have a big impact on my existing project. Does anyone have a better solution with minimal impact?
SELECT Month,date_of_created,count(distinct phone_number) as total_customers
FROM
(select month(dt_cr) as Month,
day(dt_cr) as date_of_created,
phone_number
from customer)A
group by Month,date_of_created
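
A minimal sketch to check that the two query shapes return the same rows, assuming phone_number is the column carried through the subquery. It uses Python's built-in sqlite3 rather than Spark, so strftime stands in for month() and day(), and the sample rows are made up for illustration; it says nothing about how Spark itself resolves GROUP BY expressions.

```python
import sqlite3

# Hypothetical sample rows; table and column names follow the queries above.
ROWS = [
    ("2014-01-05 10:00:00", "111"),
    ("2014-01-05 12:30:00", "111"),  # same customer, same day
    ("2014-01-05 15:45:00", "222"),
    ("2014-02-10 09:15:00", "333"),
]

# SQLite has no month()/day(), so strftime stands in for them (assumption).
DIRECT = """
SELECT CAST(strftime('%m', dt_cr) AS INTEGER) AS Month,
       CAST(strftime('%d', dt_cr) AS INTEGER) AS date_of_created,
       COUNT(DISTINCT phone_number) AS total_customers
FROM customer
GROUP BY CAST(strftime('%m', dt_cr) AS INTEGER),
         CAST(strftime('%d', dt_cr) AS INTEGER)
ORDER BY 1, 2
"""

# Same aggregation, with the functions moved into a subquery.
SUBQUERY = """
SELECT Month, date_of_created,
       COUNT(DISTINCT phone_number) AS total_customers
FROM (SELECT CAST(strftime('%m', dt_cr) AS INTEGER) AS Month,
             CAST(strftime('%d', dt_cr) AS INTEGER) AS date_of_created,
             phone_number
      FROM customer) A
GROUP BY Month, date_of_created
ORDER BY 1, 2
"""


def run_both():
    """Run both query shapes on an in-memory table and return their rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (dt_cr TEXT, phone_number TEXT)")
    conn.executemany("INSERT INTO customer VALUES (?, ?)", ROWS)
    direct = conn.execute(DIRECT).fetchall()
    rewritten = conn.execute(SUBQUERY).fetchall()
    conn.close()
    return direct, rewritten


if __name__ == "__main__":
    direct, rewritten = run_both()
    print(direct)
    print(rewritten)
```

Both queries group by the same (month, day) pair, so the subquery form changes only where the functions are evaluated, not the result.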