55

I have a DataFrame time series. I extracted the index values and want to convert each of them to datetime. How do you go about doing that? I tried pandas.to_datetime(x), but when I check the result with type() it has not been converted.

user1234440

9 Answers

109

Just try to_pydatetime()

>>> import pandas as pd
>>> t = pd.Timestamp('2016-03-03 00:00:00')
>>> type(t)
<class 'pandas._libs.tslibs.timestamps.Timestamp'>
>>> t.to_pydatetime()
datetime.datetime(2016, 3, 3, 0, 0)

Change to datetime.date type

>>> t.date()
datetime.date(2016, 3, 3)
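
Since the question is about values taken from an index, here is a minimal sketch of converting a whole DatetimeIndex in one call rather than one Timestamp at a time. The `idx` index below is made-up sample data standing in for the asker's index.

```python
import datetime
import pandas as pd

# Hypothetical sample index standing in for the DataFrame's index
idx = pd.date_range("2016-03-03", periods=3, freq="D")

# DatetimeIndex.to_pydatetime() returns a NumPy object array
# whose elements are plain datetime.datetime objects
py_datetimes = idx.to_pydatetime()

first = py_datetimes[0]
print(type(first))
print(first)
```

Note that `isinstance(ts, datetime.datetime)` is also true for a pandas Timestamp, because Timestamp subclasses datetime.datetime, so compare with `type(...)` if you need to tell them apart.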
Mike Williamson
GoingMyWay
  • 10
    as a note: `pd.to_datetime()` is now deprecated, `pd.to_pydatetime()` is the new standard – mjp Jul 27 '17 at 20:51
  • 5
    `AttributeError: module 'pandas' has no attribute 'to_pydatetime'` – SeF May 23 '22 at 14:07
5

Just an update to the question: I tried the most upvoted answer, and it gives me this warning:

usr/local/lib/python3.5/dist-packages/IPython/core/interactiveshell.py:2910: FutureWarning: to_datetime is deprecated. Use self.to_pydatetime()
  exec(code_obj, self.user_global_ns, self.user_ns)

The warning suggests using to_pydatetime() instead.

For example:

sample = pd.Timestamp('2018-05-02 10:08:54.774000')

sample.to_pydatetime() will return datetime.datetime(2018, 5, 2, 10, 8, 54, 774000)

Morris wong
  • thanks for the heads up, Morris. Looks like it's a warning, not an error, though, unless it stops the code from running. – LShapz May 02 '18 at 03:15
4

I had the same issue and tried the solution from @aikramer2 to add a column of type 'datetime.datetime' to my df, but I again got a pandas data type:

#libraries used -
import pandas as pd
import datetime as dt

#loading data into a pandas df, from a local file. note column [1] contains a datetime column -
savedtweets = pd.read_csv('/Users/sharon/Documents/ipython/twitter_analysis/conftwit.csv', sep='\t', 
                      names=['id', 'created_at_string', 'user.screen_name', 'text'], 
                      parse_dates={"created_at" : [1]})
print(int(max(savedtweets['id'])))  # 535073416026816512
print(type(savedtweets['created_at'][0]))  # result is <class 'pandas.tslib.Timestamp'>

# add a column specifically using datetime.datetime library -
savedtweets['datetime'] = savedtweets['created_at'].apply(lambda x: dt.datetime(x.year,x.month,x.day))
print(type(savedtweets['datetime'][0]))  # result is <class 'pandas.tslib.Timestamp'>

I suspect pandas converts datetime.datetime values back to its own Timestamp type when they are stored in a datetime column. I had success when I used a plain Python list to store the datetime.datetime values:

savedtweets = pd.read_csv('/Users/swragg/Documents/ipython/twitter_analysis/conftwit.csv', sep='\t', 
                      names=['id', 'created_at_string', 'user.screen_name', 'text'], 
                      parse_dates={"created_at" : [1]})
print(int(max(savedtweets['id'])))  # 535073416026816512
print(type(savedtweets['created_at'][0]))  # <class 'pandas.tslib.Timestamp'>
savedtweets_datetime= [dt.datetime(x.year,x.month,x.day,x.hour,x.minute,x.second) for x in savedtweets['created_at']]
print(savedtweets_datetime[0])  # 2014-11-19 14:13:38
print(savedtweets['created_at'][0])  # 2014-11-19 14:13:38
print(type(dt.datetime(2014,3,5,2,4)))  # <class 'datetime.datetime'>
print(type(savedtweets['created_at'][0].year))  # <class 'int'>
print(type(savedtweets_datetime))  # <class 'list'>
sharon
4

As an alternative solution if you have two separate fields (one for date; one for time):

Convert to datetime.date

df['date2'] = pd.to_datetime(df['date']).apply(lambda x: x.date())

Convert to datetime.time

df['time2'] = pd.to_datetime(df['time']).apply(lambda x: x.time())

Afterwards you can combine them:

import datetime
df['datetime'] = df.apply(lambda r: datetime.datetime.combine(r['date2'], r['time2']), axis=1)
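
If the two fields are plain strings, a vectorized alternative may be simpler than combining row by row. The sketch below assumes string columns named `date` and `time` in ISO-like formats; the sample values are invented for illustration.

```python
import pandas as pd

# Hypothetical sample data: date and time stored as strings
df = pd.DataFrame({
    "date": ["2018-05-02", "2018-05-03"],
    "time": ["10:08:54", "23:15:00"],
})

# Concatenate the two string columns and parse them in one pass
df["datetime"] = pd.to_datetime(df["date"] + " " + df["time"])

print(df["datetime"].iloc[0])
```

The result is a datetime64 column, so individual elements come back as pandas Timestamps rather than datetime.datetime objects.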

Adapted from this post

gies0r
3

Assuming you are trying to convert pandas timestamp objects, you can just extract the relevant data from the timestamp:

import datetime
import pandas

#Create the data
data = {1: pandas.Timestamp('2013-01-03 00:00:00', tz=None), 2: pandas.Timestamp('2013-01-04 00:00:00', tz=None), 3: pandas.Timestamp('2013-01-03 00:00:00', tz=None)}

#convert to df
df = pandas.DataFrame.from_dict(data, orient = 'index')
df.columns = ['timestamp']

#generate the datetime
df['datetime'] = df['timestamp'].apply(lambda x: datetime.date(x.year,x.month,x.day))

Of course, if you need seconds, minutes, and hours, you can include those as arguments for the function datetime.datetime as well.
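
For the date-only case, the `.dt` accessor gives the same result without a lambda. This is a minimal sketch on made-up data; the column name `date` is my own choice, not part of the answer above.

```python
import datetime
import pandas as pd

# Hypothetical sample column of pandas Timestamps
df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2013-01-03", "2013-01-04", "2013-01-03"]),
})

# .dt.date yields an object-dtype column of datetime.date values
df["date"] = df["timestamp"].dt.date

first = df["date"].iloc[0]
print(type(first), first)
```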

aikramer2
2
import time
time.strftime("%H:%M",  time.strptime(str(x), "%Y-%m-%d %H:%M:%S"))

Note: x should be pandas.tslib.Timestamp (as it is in the question)
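
A Timestamp also has its own strftime method, so the round trip through str and time.strptime can probably be skipped. The sketch below uses an invented sample value for `x`. It also avoids a pitfall of the string route: if the Timestamp carries fractional seconds, str(x) includes them and the "%Y-%m-%d %H:%M:%S" pattern no longer matches.

```python
import pandas as pd

# Hypothetical sample value standing in for x
x = pd.Timestamp("2016-03-03 14:05:09")

# Format the Timestamp directly, no intermediate string parsing
hh_mm = x.strftime("%H:%M")
print(hh_mm)  # 14:05
```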

Kostyantyn
1

This works for me when creating a date string to insert into MySQL:

pandas_tslib = pandas_tslib.to_pydatetime()
pandas_tslib = "'" + pandas_tslib.strftime('%Y-%m-%d') + "'"
zhm
giovannivl
1

You can convert a Timestamp to a Python datetime object with to_pydatetime(), but it seems that when applied to an entire column that conversion is thwarted:

>>> import datetime
>>> import pandas as pd
>>> ts = pd.Timestamp.now()
>>> type(ts)
<class 'pandas._libs.tslibs.timestamps.Timestamp'>
>>> type(ts.to_pydatetime())
<class 'datetime.datetime'>
>>> df = pd.DataFrame({"now": [datetime.datetime.utcnow()] * 10})
>>> type(df['now'].iloc[0])
<class 'pandas._libs.tslibs.timestamps.Timestamp'>
>>> df['now2'] = df['now'].apply(lambda dt: dt.to_pydatetime())
>>> type(df['now2'].iloc[0])
<class 'pandas._libs.tslibs.timestamps.Timestamp'>

Not sure what to make of that. (There are some situations where Pandas' Timestamp object isn't a perfect replacement for Python's datetime object, and you want the real thing.)
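
A likely explanation is that pandas re-infers a datetime64 dtype for the values returned by apply, so reading an element hands back a Timestamp again. One workaround, sketched here on made-up data, is to store the converted values in a column whose dtype is explicitly object. The column names mirror the session above; the sample values are invented.

```python
import datetime
import pandas as pd

# Hypothetical sample column of pandas Timestamps
df = pd.DataFrame({"now": pd.to_datetime(["2018-05-02 10:08:54"] * 3)})

# Build the converted values as a plain list, then wrap them in an
# object-dtype Series so pandas does not re-infer datetime64
py_values = [ts.to_pydatetime() for ts in df["now"]]
df["now2"] = pd.Series(py_values, index=df.index, dtype=object)

print(df["now2"].dtype)
print(type(df["now2"].iloc[0]))
```

The trade-off is that an object column loses the vectorized datetime operations (such as the `.dt` accessor), so this is only worth doing when real datetime.datetime objects are required.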

smontanaro
0

In my case I could not get correct output even when specifying the format: I always got the year 1970.

What solved my problem was passing the unit parameter to the function, since my timestamps have a granularity of seconds:

df_new = df.copy()  # copy so the original df is not modified
df_new['time'] = pandas.to_datetime(df['time'], unit='s')
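
To illustrate why the year 1970 shows up: without a unit, integer input is interpreted as nanoseconds since the epoch, so a value counted in seconds lands a fraction of a second after 1970-01-01. A small sketch with an invented epoch value:

```python
import pandas as pd

# Hypothetical sample: one epoch timestamp expressed in seconds
epoch_seconds = pd.Series([1525255734])

# Default interpretation is nanoseconds, which lands in 1970
without_unit = pd.to_datetime(epoch_seconds)

# Telling pandas the values are seconds gives the intended date
with_unit = pd.to_datetime(epoch_seconds, unit="s")

print(without_unit.iloc[0])
print(with_unit.iloc[0])
```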
roschach