I'm trying to understand how pandas handles datetime values when they are added to a DataFrame. On my machine, a date appears to be stored 4 hours earlier than the value I assigned. How can I stop this from happening?

Example:

import pandas as pd  
import datetime  
test = pd.DataFrame({'A':['a','b','c'],'B':[1,2,3]})
test  
Out[31]:  
   A  B  
0  a  1  
1  b  2  
2  c  3  

dt = datetime.datetime(2016,10,4)
test['dt']=dt  
test  
Out[35]: 
   A  B         dt
0  a  1 2016-10-04
1  b  2 2016-10-04
2  c  3 2016-10-04

So far so good, but when I look at the value as an array I get:

test.dt.unique()  
Out[36]: array(['2016-10-03T20:00:00.000000000-0400'], dtype='datetime64[ns]')  

How can I keep this as 2016-10-04T00: ...
I would like to keep it as a date object and have it be the same regardless of the time zone in which the code is run.

Thanks in advance.

Stein

2 Answers

I found something in the hour or so since I posted this. It is far from an ideal solution, but it will work for my purposes.

First, for reference, I found a discussion of the lack of timezone-naive datetimes in NumPy:
https://mail.scipy.org/pipermail/numpy-discussion/2013-April/066038.html

Note: I am on NumPy 1.8.1 and Pandas 0.14.0

For my purposes, I am just going to force everything to midnight in the machine's local time zone.

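import numpy as np

# On NumPy 1.8.1, str(np.datetime64(...)) ends with the local UTC offset, e.g. '-0400'.
# Take the last five characters, flip the sign, and convert to whole hours.
tz_adjust = np.timedelta64(int(-int(str(np.datetime64(datetime.datetime.now()))[-5:])/100),'h')
# Shift the stored value so that it displays as midnight in the local time zone.
test['dt']=np.datetime64(dt) + tz_adjust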
test
Out[75]: 
   A  B                  dt
0  a  1 2016-10-04 04:00:00
1  b  2 2016-10-04 04:00:00
2  c  3 2016-10-04 04:00:00
test.dt.unique()
Out[76]: array(['2016-10-04T00:00:00.000000000-0400'], dtype='datetime64[ns]')
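As an alternative to shifting the stored value, here is a minimal sketch of reading the column back without going through the NumPy array repr. It assumes a reasonably recent pandas with the `.dt` accessor; the names `as_text` and `as_dates` are only illustrative. In the question, the stored value was midnight all along, and the 4-hour shift only appeared when this NumPy version rendered the array in local time.

```python
import datetime
import pandas as pd

test = pd.DataFrame({'A': ['a', 'b', 'c'], 'B': [1, 2, 3]})
test['dt'] = datetime.datetime(2016, 10, 4)

# Format through pandas; no local-time conversion is applied to a naive column.
as_text = set(test['dt'].dt.strftime('%Y-%m-%dT%H:%M:%S'))

# Plain datetime.date objects, if only the calendar date matters.
as_dates = set(test['dt'].dt.date)

print(as_text)
print(as_dates)
```

If only the calendar date is needed, `.dt.date` may be enough on its own, since `datetime.date` carries no time-of-day or offset to be shifted.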
Stein

Also see the second comment from @MaxU above, reposted here:

test['dt'] = pd.to_datetime('2016-10-04', utc=True) 
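To illustrate what `utc=True` gives you, here is a rough sketch assuming a recent pandas, where the column becomes timezone-aware. The fixed UTC-4 offset and the variable names are assumptions chosen to mirror the machine in the question; they are not part of the original comment.

```python
import datetime
import pandas as pd

test = pd.DataFrame({'A': ['a', 'b', 'c'], 'B': [1, 2, 3]})
test['dt'] = pd.to_datetime('2016-10-04', utc=True)

# The column carries its time zone explicitly, so nothing is left to the machine's locale.
is_tz_aware = isinstance(test['dt'].dtype, pd.DatetimeTZDtype)
utc_dates = set(test['dt'].dt.strftime('%Y-%m-%d'))

# Converting to a UTC-4 offset changes the displayed wall-clock time, not the instant.
minus_four = datetime.timezone(datetime.timedelta(hours=-4))
shifted = test['dt'].dt.tz_convert(minus_four)
shifted_text = set(shifted.dt.strftime('%Y-%m-%d %H:%M'))
same_instant = bool((shifted == test['dt']).all())

print(is_tz_aware, utc_dates, shifted_text, same_instant)
```

With an explicit UTC column, the 20:00 reading only shows up when you ask for a conversion, which makes the behaviour the same wherever the code runs.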
Stein