9

I have a SQL table that contains data of the MySQL TIME type, as follows:

time_of_day
-----------
   12:34:56

I then use pandas to read the table in:

df = pd.read_sql('select * from time_of_day', engine)

Looking at df.dtypes yields:

time_of_day timedelta64[ns]

My main issue is that, when writing my df to a csv file, the data comes out all messed up, instead of essentially looking like my SQL table:

time_of_day
0 days 12:34:56.000000000

I'd like to instead (obviously) store this record as a time, but I can't find anything in the pandas docs that talks about a time dtype.

Does pandas lack this functionality intentionally? Is there a way to solve my problem without requiring janky data casting?

Seems like this should be elementary, but I'm confounded.

jpp
ryantuck
  • what does pandas.to_datetime give? – Ezer K Dec 28 '15 at 23:05
  • it returns `1970-01-01 12:34:56` – ryantuck Dec 28 '15 at 23:22
  • Same issue: when the MySQL column is of type TIME, pandas returns a Timedelta. I do not get this behavior when the column type is DATETIME. There's no option in the pandas read_sql method to parse TIME. – wbg Jul 07 '17 at 19:23
  • The issue I think is NumPy hasn't implemented a `time` class, and Pandas is built on NumPy. So don't think this is natural for Pandas. – jpp Jan 02 '19 at 00:32

4 Answers

4

Pandas does not support a time dtype series

Pandas (and NumPy) do not have a time dtype. Since you wish to avoid Pandas timedelta, you have 3 options: Pandas datetime, Python datetime.time, or Python str. Below they are presented in order of preference. Let's assume you start with the following dataframe:

df = pd.DataFrame({'time': pd.to_timedelta(['12:34:56', '05:12:45', '15:15:06'])})

print(df['time'].dtype)  # timedelta64[ns]

Pandas datetime series

You can use Pandas datetime series and include an arbitrary date component, e.g. today's date. Underlying such a series are integers, which makes this solution the most efficient and adaptable.

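If you have no particular date in mind, a common choice is the epoch, 1-Jan-1970. This is the date older pandas versions filled in when pd.to_datetime was given timedeltas; recent versions raise a TypeError for that call, so add the epoch explicitly:

df['time'] = pd.Timestamp('1970-01-01') + df['time']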

print(df)

#                  time
# 0 1970-01-01 12:34:56
# 1 1970-01-01 05:12:45
# 2 1970-01-01 15:15:06

You can also specify a date, such as today (starting again from the original timedelta series):

df['time'] = pd.Timestamp('today').normalize() + df['time']

print(df)

#                  time
# 0 2019-01-02 12:34:56
# 1 2019-01-02 05:12:45
# 2 2019-01-02 15:15:06
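
As a rough illustration of why a datetime series is convenient, the sketch below uses the same sample times, anchored to an arbitrary date chosen only for this example. It shows that the time components stay available through vectorised .dt operations:

```python
import pandas as pd

# Same illustrative times as above, anchored to an arbitrary date
td = pd.Series(pd.to_timedelta(['12:34:56', '05:12:45', '15:15:06']))
ts = pd.Timestamp('2019-01-02') + td

# Vectorised component access via the .dt accessor
hours = ts.dt.hour.tolist()

# Vectorised comparison: which rows fall before noon?
before_noon = (ts.dt.hour < 12).tolist()

print(hours)        # [12, 5, 15]
print(before_noon)  # [False, True, False]
```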

Pandas object series of Python datetime.time values

The Python datetime module from the standard library supports datetime.time objects. You can convert your series to an object dtype series holding references to datetime.time objects. Operations will no longer be vectorised, but each value is still a proper time object rather than a string.

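# Starting from the original timedelta series; recent pandas rejects timedeltas in pd.to_datetime
df['time'] = (pd.Timestamp('1970-01-01') + df['time']).dt.time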

print(df)

#        time
# 0  12:34:56
# 1  05:12:45
# 2  15:15:06

print(df['time'].dtype)
# object

print(type(df['time'].at[0]))
# <class 'datetime.time'>

Pandas object series of Python str values

Converting to strings is only recommended for presentation purposes that are not supported by other types, e.g. Pandas datetime or Python datetime.time. For example:

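# Starting from the original timedelta series; recent pandas rejects timedeltas in pd.to_datetime
df['time'] = (pd.Timestamp('1970-01-01') + df['time']).dt.strftime('%H:%M:%S')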

print(df)

#        time
# 0  12:34:56
# 1  05:12:45
# 2  15:15:06

print(df['time'].dtype)
# object

print(type(df['time'].at[0]))
# <class 'str'>
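
To tie the string option back to the CSV output in the question, here is a minimal sketch. It assumes every value lies between 00:00:00 and 23:59:59 (MySQL TIME columns can hold values outside that range, which would wrap around here). The in-memory buffer is only a stand-in for a real file path:

```python
import io
import pandas as pd

# One-row frame mimicking what read_sql returned in the question
df = pd.DataFrame({'time_of_day': pd.to_timedelta(['12:34:56'])})

# Anchor the timedelta to an arbitrary date, then format only the time part
out = df.copy()
out['time_of_day'] = (
    pd.Timestamp('1970-01-01') + out['time_of_day']
).dt.strftime('%H:%M:%S')

buf = io.StringIO()  # stand-in for a file path such as 'times.csv'
out.to_csv(buf, index=False)
csv_text = buf.getvalue()
print(csv_text)
```

Working on a copy keeps the original timedelta column available for arithmetic, with the string form used only at export time.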
jpp
1

It's a hack, but you can pull out the components to create a string and convert that string to a datetime.time(h, m, s) object:

from datetime import datetime

def convert(td):
    # Join the hour, minute and second components into an 'H:M:S' string
    c = td.components
    time = [str(c.hours), str(c.minutes), str(c.seconds)]
    return datetime.strptime(':'.join(time), '%H:%M:%S').time()

df['time'] = df['time'].apply(convert)
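
A possible variation on the same idea skips the string round trip and passes the components straight to datetime.time. The helper name to_time is made up for this sketch, and it assumes each value is non-negative and shorter than one day (the days component is ignored):

```python
import datetime
import pandas as pd

def to_time(td):
    # Assumes 0 <= td < 1 day; any days component is silently dropped
    c = td.components
    return datetime.time(c.hours, c.minutes, c.seconds)

s = pd.Series(pd.to_timedelta(['12:34:56', '05:12:45']))
times = s.apply(to_time)
print(times.tolist())
```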
Infamouse
0

Found a solution, but I feel like there's got to be something more elegant than this:

def convert(x):
    # x is a pandas Timedelta; anchor it to the epoch so it can be formatted as a time
    return (pd.Timestamp('1970-01-01') + x).strftime('%H:%M:%S')

df['time_of_day'] = df['time_of_day'].apply(convert)
ryantuck
-1
df['time_of_day'] = (pd.Timestamp('1970-01-01') + df['time_of_day']).apply(lambda x: x.time())

Adapted this code

gies0r