4

I have a column of timestamps in seconds (since midnight) with nanosecond precision, like 34200.934549345, 34205.735545344, and so on, in a DataFrame df.

These timestamps come from the same day 2011-01-10.

How can I convert these seconds with nanosecond precision into NumPy's datetime64 format?

I would like to have these entries in my df

2011-01-10 9:30:00.934549345
2011-01-10 9:30:05.735545344

I need to do exactly the same operation as in this example, under SOLUTION in the linked question.

Is it possible?

Plug4

3 Answers

5
> import pandas as pd
> df = pd.DataFrame({'seconds_since_midnight': [34200.934549345, 34205.735545344]})
> df['actual_date'] = (df.seconds_since_midnight * 1e9).astype('timedelta64[ns]') + pd.to_datetime('2011-01-10')
> df
   seconds_since_midnight                   actual_date
0            34200.934549 2011-01-10 09:30:00.934549345
1            34205.735545 2011-01-10 09:30:05.735545344

[2 rows x 2 columns]
U2EF1
  • this is what you want – Andy Hayden Mar 06 '14 at 01:17
  • I thought before you used: .astype('timedelta64[s]') ? – Andy Hayden Mar 06 '14 at 02:21
  • @AndyHayden And then I realized that chops off the fractional part. If you notice in the first example the dates were rounded to the nearest second (we had `2011-01-10 09:30:00` and `2011-01-10 09:30:05`). This version is correct afaict. – U2EF1 Mar 06 '14 at 02:51
  • Interesting...! I guess that makes sense now I think about it, that's how numpy can deal with pico-seconds et al. Whereas in pandas ns is the smallest. – Andy Hayden Mar 06 '14 at 03:06
  • @U2EF1 thanks! But why do I get this error `TypeError: descriptor '__add__' requires a 'datetime.datetime' object but received a 'float'` when I am testing your code – Plug4 Mar 06 '14 at 07:22
  • or I get `TypeError: unsupported operand type(s) for +: 'Timestamp' and 'float'` if I reverse the summation on the last line – Plug4 Mar 06 '14 at 07:29
  • @CharlesM pandas date/time has gone through the wringer fairly recently. This exact code works for me. What's your `pd.__version__`? Mine is `'0.13.0'`. – U2EF1 Mar 06 '14 at 08:54
  • @U2EF1 mine is still 0.12. I'm having problems installing 0.13 – Plug4 Mar 06 '14 at 16:50
  • @CharlesM I would definitely recommend upgrading if you're going to be messing with timeseries, it's a lot nicer now. Otherwise get your start date in unix time (for my time zone that's `1294646400000 ms`), add it to the seconds-since-midnight column, and convert back to datetime. But don't do that, just upgrade. – U2EF1 Mar 06 '14 at 17:12
  • @U2EF1 Perfect, no choice I have to find a way to get 0.13.1 installed. I have no idea why my installation using `pip install pandas --upgrade` leads to errors all the time – Plug4 Mar 06 '14 at 17:17
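For anyone who wants the same conversion in plain NumPy, here is a minimal sketch. It is not the answerer's code: the helper name `seconds_to_datetime64` is made up for illustration, and rounding to whole nanoseconds before the cast is my own assumption, added so that floating-point noise in `seconds * 1e9` cannot be truncated into an off-by-one nanosecond.

```python
import numpy as np

def seconds_to_datetime64(seconds, day):
    """Hypothetical helper: seconds since midnight -> numpy datetime64[ns]."""
    seconds = np.asarray(seconds, dtype='float64')
    # Round to whole nanoseconds first, then cast to an integer count
    nanos = np.round(seconds * 1e9).astype('int64')
    # Midnight of the given day at nanosecond resolution, plus the offsets
    return np.datetime64(day, 'ns') + nanos.astype('timedelta64[ns]')

stamps = seconds_to_datetime64([34200.934549345, 34205.735545344], '2011-01-10')
print(stamps)
```

The resulting array has dtype `datetime64[ns]`, so it should be assignable to a DataFrame column directly, e.g. `df['actual_date'] = stamps`.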
1

I used strptime() to parse strings containing fractional seconds that had to be shown with microsecond resolution. Since the number of decimal places was not fixed, I had to handle every case. I had to use Python 2.6.7, where strptime needed integer seconds and would not accept the fractional part in the string. With version 2.7.6 I could have used the %f part of the format, but I would still have had to make sure that the fractional part of the seconds had at most 6 digits.

import datetime as DT

def mystrptime(val):
  # Split off the fractional seconds, if there are any
  vals = val.split('.')
  if len(vals) == 1:
    dt = DT.datetime.strptime(val, '%Y-%m-%d %H%M%S')
  else:
    nofrag, frag = vals
    # Truncate or zero-pad the fraction to exactly 6 digits (microseconds)
    frag = frag[:6].ljust(6, '0')
    nofrag_dt = DT.datetime.strptime(nofrag, '%Y-%m-%d %H%M%S')
    dt = nofrag_dt.replace(microsecond=int(frag))
  return dt

Once Python 2.7.6 or above has been installed, the %f option can be used as follows:

import datetime as DT

def mystrptime(val):
  vals = val.split('.')
  if len(vals) > 1:
    nofrag, frag = vals
    frag = frag[:6] # %f accepts 1 to 6 digits, so shorter fractions are fine
    val = '.'.join([nofrag, frag])
  else:
    val += '.0' # the format below requires a fractional part
  dt = DT.datetime.strptime(val, '%Y-%m-%d %H%M%S.%f')
  return dt
sabbahillel
0

I'm able to parse this with datetime.strptime, but I have to trim the last 3 characters off the string:

>>> import datetime
>>> ds
'2011-01-10 9:30:00.934549345'
>>> datetime.datetime.strptime(ds[:-3], '%Y-%m-%d %H:%M:%S.%f')
datetime.datetime(2011, 1, 10, 9, 30, 0, 934549)

It appears that the finest granularity allowed is microseconds, which are millionths of a second, so the value must fit within six digits:

>>> datetime.datetime(2011, 1, 10, 9, 30, 0, 934549345)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: microsecond must be in 0..999999

Since you have the time in nanoseconds, if you want to convert to Python datetime objects, you'll have to lose that level of precision, or be forced to create your own workaround.
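One possible workaround, sketched here with the standard library only: keep a regular datetime at microsecond resolution and carry the leftover nanoseconds alongside it as a plain integer. The helper name `parse_ns_timestamp` and the (datetime, extra nanoseconds) return shape are my own assumptions for illustration, not something datetime provides.

```python
import datetime

def parse_ns_timestamp(text):
    """Hypothetical helper: return (datetime, leftover_nanoseconds)."""
    # Separate the whole-second part from the fractional digits
    head, _, frac = text.partition('.')
    # Normalise the fraction to exactly 9 digits (nanoseconds)
    frac = frac.ljust(9, '0')[:9]
    base = datetime.datetime.strptime(head, '%Y-%m-%d %H:%M:%S')
    # First 6 digits are microseconds, the last 3 are the remaining nanoseconds
    micros, extra_ns = divmod(int(frac), 1000)
    return base.replace(microsecond=micros), extra_ns

dt, extra_ns = parse_ns_timestamp('2011-01-10 9:30:00.934549345')
print(dt, extra_ns)
```

The datetime part can be used for ordinary date arithmetic, while the extra integer keeps the last three digits from being lost.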

Russia Must Remove Putin