1

I am trying to find the max value of grouping two columns together. I am trying to find the busiest hour in a day, given a month.

SELECT 
date_part('day', tpep_pickup_datetime) AS trip_day,
date_part('hour', tpep_pickup_datetime) AS trip_hour,
count(*) AS numbers
FROM nyc_yellow_2019_01
GROUP BY trip_day, trip_hour
ORDER BY trip_day, count(*) desc)

This lists all the hours every day, but i only want the top hour for each day.

I also tried creating a view table, and from that I wrote:

SELECT DISTINCT(trip_day) MAX(numbers)
FROM busy_hour
GROUP BY trip_day;

which is close but would not tell me the exact hour.

Lastly I tried a where clause on the last query:

SELECT trip_hour
FROM busy_hour
WHERE
(SELECT DISTINCT(trip_day) MAX(numbers)
FROM busy_hour
GROUP BY trip_day);

This got me an error, stating where subquery can only bring back one column.

Any help would be appreciated

GMB
  • 216,147
  • 25
  • 84
  • 135
king_sules
  • 39
  • 8

2 Answers2

1

You appear to be using Postgres, as the use of date_part() indicates.

If so, you can use distinct on:

select distinct on (trip_day)
    date_part('day', tpep_pickup_datetime) as trip_day,
    date_part('hour', tpep_pickup_datetime) as trip_hour,
    count(*) as numbers
from nyc_yellow_2019_01
group by trip_day, trip_hour
order by trip_day, numbers desc
GMB
  • 216,147
  • 25
  • 84
  • 135
1

I am trying to find the busiest hour in a day, given a month.

If you want the busiest hour per day, then use a window function:

SELECT th.*
FROM (SELECT date_part('day', tpep_pickup_datetime) AS trip_day,
             date_part('hour', tpep_pickup_datetime) AS trip_hour,
             count(*) AS numbers,
             row_number() over (partition by date_part('day', tpep_pickup_datetime) order by count(*) desc) as seqnum
      FROM nyc_yellow_2019_01
      GROUP BY trip_day, trip_hour
     ) th
WHERE seqnum = 1;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786