11

I have a number of posts saved into a InnoDB table on MySQL. The table has the columns "id", "date", "user", "content". I wanted to make some statistic graphs, so I ended up using the following query to get the amount of posts per hour of yesterday:

SELECT HOUR(FROM_UNIXTIME(`date`)) AS `hour`, COUNT(date)  from fb_posts 
WHERE DATE(FROM_UNIXTIME(`date`)) = CURDATE() - INTERVAL 1 DAY GROUP BY hour

This outputs the following data:

table data

I can edit this query to get any day I want. But what I want now is the AVERAGE of each hour of every day, so that if on Day 1 at 00 hours I have 20 posts and on Day 2 at 00 hours I have 40, I want the output to be "30". I'd like to be able to pick date periods as well if it's possible.

Thanks in advance!

Daniel Sorichetti
  • 1,921
  • 1
  • 20
  • 34

3 Answers3

8

You can use a sub-query to group the data by day/hour, then take the average by hour across the sub-query.

Here's an example to give you the average count by hour for the past 7 days:

select the_hour,avg(the_count)
from
(
  select date(from_unixtime(`date`)) as the_day,
    hour(from_unixtime(`date`)) as the_hour, 
    count(*) as the_count
  from fb_posts
  where `date` >= unix_timestamp(current_date() - interval 7 day)
  and created_on < unix_timestamp(current_date())
  group by the_day,the_hour
) s
group by the_hour
Ike Walker
  • 64,401
  • 14
  • 110
  • 109
  • why do you add an "s" after the from (...) subquery? – Daniel Sorichetti May 24 '12 at 15:30
  • 1
    That's an alias for the sub-query, which is required in MySQL to avoid this error: `ERROR 1248 (42000): Every derived table must have its own alias`. You could be more verbose with it if you want and use something like `as sub_query`. – Ike Walker May 24 '12 at 15:59
  • Oh, I see. Although Linoff solution was very similar, this is the one that helped me understand SQL much more. Thanks! – Daniel Sorichetti May 24 '12 at 17:22
0

Aggregate the information by date and hour, and then take the average by hour:

select hour, avg(numposts)
from (SELECT date(`date`) as day, HOUR(FROM_UNIXTIME(`date`)) AS `hour`,
             count(*) as numposts
      from fb_posts
      WHERE DATE(FROM_UNIXTIME(`date`)) between <date1> and <date2>
      GROUP BY date(`date`), hour 
     ) d
group by hour
order by 1

By the way, I prefer including the explicit order by, since most databases do not order the results of a group by. Mysql happens to be one database that does.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • For the sake of performance you should apply the functions to the input values (not the date column) in the where clause of your sub-query. – Ike Walker May 24 '12 at 14:47
  • Also, the results are implicitly ordered by the group by, so the order by is not necessary if that's the order you want. – Ike Walker May 24 '12 at 14:48
  • 2
    This appears to be true for mysql. ANSI SQL is quite specific that group by results are not ordered (even though they often are in practice). I would hope that mysql is smart enough to ignore the "order by" clause if it doesn't need to do the work. – Gordon Linoff May 24 '12 at 14:56
  • Yes, thanks for clarifying. My comment was specific to MySQL since that's what the OP is using. – Ike Walker May 24 '12 at 15:23
0
SELECT 
    HOUR(FROM_UNIXTIME(`date`)) AS `hour`
    , COUNT(`id`) \ COUNT(DISTINCT TO_DAYS(`date`)) AS avgHourlyPostCount
FROM fb_posts 
WHERE `date` > '2012-01-01' -- your optional date criteria 
GROUP BY hour

This gives you a count of all the posts, divided by the number of days, by hour.

Andrew
  • 4,574
  • 26
  • 31