I have a table with 20+ columns, 45 millions rows. I am looking to summarize information for each Id with partition by, so that the numner of rows stay the same and information remains for each row
select min(Distance) over(partition by Id) as min_distance
, max(Distance) over(partition by Id) as max_distance
, avg(Distance) over(partition by Id) as mean_distance
, stdev(Distance) over(partition by Id) as sd_distance
, sum(Distance) over(partition by Id) as sum_distance
, min(Speed) over(partition by Id) as min_speed
, max(Speed) over(partition by Id) as max_speed
, avg(Speed) over(partition by Id) as mean_speed
, stdev(Speed) over(partition by Id) as sd_speed
A test of only 10000 rows has been running for 2 hours. I wonder if there is something we can do to improve the performance.