-1

I have a table with 20+ columns, 45 millions rows. I am looking to summarize information for each Id with partition by, so that the numner of rows stay the same and information remains for each row

select min(Distance) over(partition by Id) as min_distance
, max(Distance) over(partition by Id) as max_distance
, avg(Distance) over(partition by Id) as mean_distance
, stdev(Distance) over(partition by Id) as sd_distance
, sum(Distance) over(partition by Id) as sum_distance
, min(Speed) over(partition by Id) as min_speed
, max(Speed) over(partition by Id) as max_speed
, avg(Speed) over(partition by Id) as mean_speed
, stdev(Speed) over(partition by Id) as sd_speed

A test of only 10000 rows has been running for 2 hours. I wonder if there is something we can do to improve the performance.

Kenny
  • 1,902
  • 6
  • 32
  • 61
  • Get more hardware? 2 hours is astonishingly long - I would love to see what you use as SQL Server hardware AND to see the execution plan (which you tag but then do not give - shame). – TomTom Jul 06 '18 at 12:56
  • Anyway, closing - this is a very specific question an falls into dba.stackexchange.com where the db specialists are. – TomTom Jul 06 '18 at 12:56
  • Without table definition (including indexes) this is impossible to answer. – Sean Lange Jul 06 '18 at 13:49

1 Answers1

1

Why not just:

select Id, MIN(Distance)  as min_distance
, max(Distance)  as max_distance
, avg(Distance)  as mean_distance
, stdev(Distance)  as sd_distance
, sum(Distance)  as sum_distance
, min(Speed)  as min_speed
, max(Speed)  as max_speed
, avg(Speed)  as mean_speed
, stdev(Speed)  as sd_speed
FROM mytable
GROUP BY id
user1443098
  • 6,487
  • 5
  • 38
  • 67