Here is my problem in short: I am trying to write my data (containing, among other things, np.datetime64 values) to CSV and then read them back, and I want my times not to change.
As discussed in many places, np.datetime64 keeps everything in binary form and in UTC in memory, but parses strings without an explicit offset as local time.
Here is a trivial example of my problem, where reading back with pd.read_csv("foo") what was saved with df.to_csv("foo") results in altered times:
In[184]: num = np.datetime64(datetime.datetime.now())
In[185]: num
Out[181]: numpy.datetime64('2015-10-28T19:19:42.408000+0100')
In[186]: df = pd.DataFrame({"Time":[num]})
In[187]: df
Out[183]:
Time
0 2015-10-28 18:19:42.408000
In[188]: df.to_csv("foo")
In[189]: df2=pd.read_csv("foo")
In[190]: df2
Out[186]:
Unnamed: 0 Time
0 0 2015-10-28 18:19:42.408000
In[191]: np.datetime64(df2.Time[0])
Out[187]: numpy.datetime64('2015-10-28T18:19:42.408000+0100')
In[192]: num == np.datetime64(df2.Time[0])
Out[188]: False
(as usual:)
import datetime
import numpy as np
import pandas as pd
There is a very large number of questions, and lots of info on the web, but I've been googling for a while now and have not been able to find an answer on how to overcome this. There should be some way to save the data in Zulu time, or to read them assuming UTC, but I have not found any guidance on which would be the best (or even a good) way to do it. I can do
In[193]: num == np.datetime64(df2.Time[0]+"Z")
Out[189]: True
but that seems really bad to me in terms of practice, portability and efficiency... (plus it's annoying that the default save and read mess things up)
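One direction that might avoid the string parsing altogether is to let pandas rebuild the datetime column itself when reading, so that no bare string is ever passed to np.datetime64. The following is only a minimal sketch under some assumptions that are not part of the session above: it uses a fixed timestamp instead of datetime.datetime.now(), an in-memory io.StringIO buffer instead of the file "foo", index=False to avoid the extra "Unnamed: 0" column, and the parse_dates argument of pd.read_csv. The comparison is done on pandas Timestamp objects rather than on re-parsed strings.

```python
import io

import numpy as np
import pandas as pd

# Fixed example timestamp (assumption: stands in for the value built
# from datetime.datetime.now() in the session above).
num = np.datetime64("2015-10-28T18:19:42.408")

df = pd.DataFrame({"Time": [num]})

# Write to an in-memory buffer instead of the file "foo".
buf = io.StringIO()
df.to_csv(buf, index=False)  # index=False avoids the "Unnamed: 0" column
buf.seek(0)

# parse_dates asks pandas to convert the column back to datetime64,
# so the string is never handed to np.datetime64 directly.
df2 = pd.read_csv(buf, parse_dates=["Time"])

# Compare as pandas Timestamps rather than by re-parsing a string.
round_trip_ok = bool(df2["Time"].iloc[0] == pd.Timestamp(num))
print(round_trip_ok)
```

Whether this counts as a good practice for larger files or for mixed time zones is something I have not verified; it only shows that the values can survive the round trip without appending "Z" by hand.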