I am searching for a way to get the amount of active trips in a GTFS feed for each minute of a day through a postgresql query.
For every trip I have the start and endtimes (in seconds) in a denormalized table. It looks something like this:
The query to give me trips that are active for a given timeframe (e.g. here 43000 to 43600 seconds) looks like this:
SELECT
COUNT(trips.trip_id)
FROM denormalized_trips AS trips
LEFT JOIN gtfs_calendar_dates AS calendar_dates
ON calendar_dates.service_id = trips.service_id
AND calendar_dates.agency_key = trips.agency_key
AND date = '2017-07-03'
AND exception_type = 1
INNER JOIN gtfs_calendar AS calendar
ON trips.service_id = calendar.service_id
AND calendar.agency_key = trips.agency_key
AND calendar.wednesday = 1
WHERE (
trip_start_time BETWEEN 46800 AND 47100
AND '2017-07-03' BETWEEN calendar.start_date AND calendar.end_date
)
AND NOT EXISTS (
SELECT 0
FROM gtfs_calendar_dates AS date_exceptions
WHERE date = '2017-07-03'
AND date_exceptions.agency_key = trips.agency_key
AND date_exceptions.service_id = calendar.service_id
AND exception_type = 2
);
This will result in 12 trips that will start between 13:00pm and 13:05pm.
Now I want to do that for the whole day. I want to get the amount of trips that get active in an interval of e.g. 1 minute or maybe more like 5 minutes. I tried it with a loop but this seems to just give me 1 result back. Here is what I came up with for now:
CREATE OR REPLACE FUNCTION get_active_trips(n int)
RETURNS INTEGER AS
$BODY$
DECLARE
count INTEGER;
BEGIN
FOR counter IN 43130..50000 BY 60 LOOP
SELECT
COUNT(trips.trip_id)
INTO count
FROM denormalized_trips AS trips
LEFT JOIN gtfs_calendar_dates AS calendar_dates
ON calendar_dates.service_id = trips.service_id
AND calendar_dates.agency_key = trips.agency_key
AND date = '2017-07-03'
AND exception_type = 1
INNER JOIN gtfs_calendar AS calendar
ON trips.service_id = calendar.service_id
AND calendar.agency_key = trips.agency_key
AND calendar.wednesday = 1
WHERE (
trip_start_time BETWEEN counter AND counter + 60
AND '2017-07-03' BETWEEN calendar.start_date AND calendar.end_date
)
AND NOT EXISTS (
SELECT 0
FROM gtfs_calendar_dates AS date_exceptions
WHERE date = '2017-07-03'
AND date_exceptions.agency_key = trips.agency_key
AND date_exceptions.service_id = calendar.service_id
AND exception_type = 2
);
END LOOP;
RETURN count;
END;
$BODY$ LANGUAGE plpgsql STABLE;
The result of calling SELECT get_active_trips(1);
Now I would like to get something like a table or an array of results back instead of just 1 entry. How would I do that?
Any help is highly appreciated.