
GTFS times commonly exceed 23:59:59 because a service day can run past midnight. For example, the last time may be 25:20:00 (01:20:00 the next day), so converting these values to datetime raises an error as soon as such a time is encountered.
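
To illustrate the failure, here is a minimal sketch using the standard library's `datetime.strptime`. The exact conversion call that raised the error is not shown in the question, so the choice of `strptime` and the helper name `parses_as_clock_time` are assumptions for illustration only.

```python
from datetime import datetime

def parses_as_clock_time(value):
    """Return True if value is a valid %H:%M:%S clock time (hypothetical helper)."""
    try:
        datetime.strptime(value, "%H:%M:%S")
        return True
    except ValueError:
        # %H only accepts hours 00-23, so GTFS hours of 24 or more are rejected
        return False

print(parses_as_clock_time("23:59:59"))  # True
print(parses_as_clock_time("25:20:00"))  # False
```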

Is there a way to convert GTFS time values into standard datetime format without splitting out the hour, rebuilding the string in a valid format, and then converting that string to a datetime?

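import pandas as pd

t = ['24:22:00', '24:30:00', '25:40:00', '26:27:00']
# current workaround: split out the hour, wrap it modulo 24, zero-pad it, and rebuild the string
str(pd.to_numeric(t[0].split(':')[0]) % 24).zfill(2) + ':' + ':'.join(t[0].split(':')[1:])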

For the above examples, I would expect to see:

['00:22:00', '00:30:00', '01:40:00', '02:27:00']
theotheraussie

2 Answers

from datetime import datetime, timedelta

def gtfs_time_to_datetime(gtfs_date, gtfs_time):
    hours, minutes, seconds = tuple(
        int(token) for token in gtfs_time.split(":")
    )
    # adding the time as a timedelta lets hours of 24 or more roll into the next day
    return (
        datetime.strptime(gtfs_date, "%Y%m%d") + timedelta(
            hours=hours, minutes=minutes, seconds=seconds
        )
    )

This gives the following result:

>>> gtfs_time_to_datetime("20191031", "24:22:00")
datetime.datetime(2019, 11, 1, 0, 22)
>>> gtfs_time_to_datetime("20191031", "24:22:00").time().isoformat()
'00:22:00'

>>> t = ['24:22:00', '24:30:00', '25:40:00', '26:27:00']
>>> [ gtfs_time_to_datetime("20191031", tt).time().isoformat() for tt in t]
['00:22:00', '00:30:00', '01:40:00', '02:27:00']
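
If no service date is at hand and only the wrapped clock time is needed, the same idea can be sketched without `datetime` at all. This is a minimal sketch, not part of the answer above: the name `wrap_gtfs_time` is hypothetical, and it assumes the input is always `H:MM:SS` or `HH:MM:SS`. It returns the number of days carried past the service day alongside the wrapped time, so the "next day" information is not lost.

```python
def wrap_gtfs_time(gtfs_time):
    """Split a GTFS time into (days past the service day, wrapped 'HH:MM:SS').

    Hypothetical helper; assumes input of the form H:MM:SS or HH:MM:SS.
    """
    hours, minutes, seconds = (int(token) for token in gtfs_time.split(":"))
    # divmod gives the whole days carried over and the hour within that day
    day_offset, hour = divmod(hours, 24)
    return day_offset, "{:02d}:{:02d}:{:02d}".format(hour, minutes, seconds)

t = ['24:22:00', '24:30:00', '25:40:00', '26:27:00']
print([wrap_gtfs_time(tt) for tt in t])
# [(1, '00:22:00'), (1, '00:30:00'), (1, '01:40:00'), (1, '02:27:00')]
```

Using `divmod` rather than subtracting 24 once also covers hours of 48 or more.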
jbernardes

I didn't find an easy way, so I just wrote a function to do it.

If anyone else wants the solution, here is mine:

from datetime import timedelta
import pandas as pd

def list_to_real_datetime(time_list, date_exists=False):
    '''
    Convert a list of GTFS times to real datetime list

    :param time_list: GTFS times ('HH:MM:SS'; when date_exists is set, the
        layout is assumed to be 'YYYY-MM-DD HH:MM:SS')
    :param date_exists: Flag indicating if the date exists in the list elements
    :return: An adjusted list of time to conform with real date times
    '''

    # new list of times to be returned
    new_time = []

    for time in time_list:

        # use a placeholder date when the elements carry no date of their own
        if not date_exists:
            time = '1970-01-01 ' + time

        # assumption: date and time are separated by a single space
        date_part, time_part = time.rsplit(' ', 1)
        hour, rest = time_part.split(':', 1)

        # hours of 24 or more roll over into the following day(s)
        plus_days, hour = divmod(int(hour), 24)

        # reset the time to a real format and convert it to a datetime
        time = pd.to_datetime(
            '{} {:02d}:{}'.format(date_part, hour, rest),
            format='%Y-%m-%d %H:%M:%S'
        )

        new_time.append(time + timedelta(days=plus_days))

    return new_time
theotheraussie