2

I'm using SQL in SAS.

I'm doing a SQL query with a GROUP BY clause on a continuous variable (made discrete), and I'd like it to aggregate more. I'm not sure this is clear, so here is an example.

Here is my query :

SELECT CEIL(travel_time) AS time_in_mn, MEAN(foo) AS mean_foo
FROM my_table
GROUP BY CEIL(travel_time)

This will give me the mean value of foo for each different value of travel_time. Thanks to the CEIL() function, it will group by minutes and not seconds (travel_time can take values such as 14.7 (minutes)). But I'd like to be able to group by groups of 5 minutes, for instance, so that I have something like that :

time_in_mn  mean_foo
5           4.5
10          3.1
15          17.6
20          12

(Of course, the mean(foo) should be done over the whole interval, so for time_in_mn = 5, mean_foo should be the mean of foo where travel_time in (0,1,2,3,4,5) )

How can I achieve that ?

(Sorry if the answer can be found easily, the only search term I could think of is group by step, which gives me a lot of "step by step tutorials" about SQL...)

Mureinik
  • 297,002
  • 52
  • 306
  • 350
François M.
  • 4,027
  • 11
  • 30
  • 81

1 Answers1

6

A common idiom of "ceiling to steps" (or rounding, or flooring, for that matter) is to divide by the step, ceil (or round, or floor, of course) and then multiply by it again. This way, if we take, for example, 12.4:

  1. Divide: 12.4 / 5 = 2.48
  2. Ceil: 2.48 becomes 3
  3. Multiply: 3 * 5 = 15

And in SQL form:

SELECT   5 * CEIL(travel_time / 5.0) AS time_in_mn,
         MEAN(foo) AS mean_foo
FROM     my_table
GROUP BY 5 * CEIL(travel_time / 5.0)
Mureinik
  • 297,002
  • 52
  • 306
  • 350
  • Thank you. It almost gives the desired result, except for the fact that each line is repeated identically for as many values of travel_time that there are (I think). However, I found how to have only one line for each duplicate, I suggested an edit :). In case it is not accepted, the edit consists of doing `GROUP BY time_in_mn` instead of `GROUP BY 5 * CEIL(travel_time / 5.0)` – François M. Mar 04 '16 at 10:04