
I have multiple strings representing timestamps, for example 19551231 (%Y%m%d) or 20210216154500 (%Y%m%d%H%M%S). As you can see, the format may vary.

I'm looking for a way to convert all these different strings to a single DatetimeWithNanoseconds format.

I know I can convert a timestamp to DatetimeWithNanoseconds using integers like this: DatetimeWithNanoseconds(2020, 6, 22, 17, 1, 30, nanosecond=0).

Does that mean I have to manually parse every string to get the relevant integers? Is there a better way to do this, similar to how strptime works (using a format string like %Y%m%d to describe the layout of the input)?

cuzureau

2 Answers


You offered 8- and 14-character example timestamps. It appears you want to tack on 9 or more zeros, converting them to uniform 23-character human-readable timestamps (14 digits of date and time plus 9 digits of nanoseconds). At that point it would be straightforward to put the result in RFC 3339 format and call from_rfc3339() to obtain a DatetimeWithNanoseconds.

Consider using a simple while loop:

def pad(ts):  # hypothetical helper name
    # Append zeros until the stamp is 23 characters long
    while len(ts) < 23:
        ts += '0'
    return ts

A better way to accomplish the same thing:

return ts + '0' * (23 - len(ts))

EDIT

You will want a couple of helpers here. Each one is unit testable, and offers a very simple API.

First one turns everything into uniform 23-char human-readable timestamps as I mentioned above.

The second would take the first 14 characters and turn them into integer seconds since the epoch, then tack on the nanoseconds. I have something like this in mind:

import datetime as dt

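def to_nanosec(stamp: str):
    # stamp: 14 digits of %Y%m%d%H%M%S followed by 9 digits of nanoseconds
    assert 23 == len(stamp), stamp
    # Month comes before day, matching the %Y%m%d layout in the question
    d = dt.datetime.strptime(stamp[:14], '%Y%m%d%H%M%S')
    # Integer modulus: `% 1e9` would coerce the 23-digit int to float and lose the low digits
    return 1e9 * d.timestamp() + int(stamp) % 1_000_000_000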

Equivalently that 2nd term could be … + int(stamp[14:])

Prefer int(1e9), or 1_000_000_000, if returning an int is important.
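
For an all-integer result, a minimal sketch might look like the following. It assumes the timestamps are in UTC (the question does not say), and the name to_nanosec_int is hypothetical. Subtracting a fixed epoch avoids the local-time dependence of .timestamp() on naive datetimes.

```python
import datetime as dt

# Assumption: the input timestamps are UTC.
EPOCH = dt.datetime(1970, 1, 1, tzinfo=dt.timezone.utc)

def to_nanosec_int(stamp: str) -> int:
    """Hypothetical helper: 23-digit stamp -> integer nanoseconds since the epoch."""
    if len(stamp) != 23 or not stamp.isdigit():
        raise ValueError(f"expected 23 digits, got {stamp!r}")
    # First 14 digits are %Y%m%d%H%M%S; attach UTC so the subtraction is well defined.
    d = dt.datetime.strptime(stamp[:14], "%Y%m%d%H%M%S").replace(tzinfo=dt.timezone.utc)
    # timedelta // timedelta yields an int, so no float rounding is involved.
    whole_seconds = (d - EPOCH) // dt.timedelta(seconds=1)
    # Last 9 digits are the nanoseconds.
    return whole_seconds * 1_000_000_000 + int(stamp[14:])
```

Dates before 1970, such as 19551231, simply come out negative.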

You certainly could break out character ranges and put punctuation such as hyphens, colons, and a trailing Z between them prior to calling from_rfc3339(), but .strptime() might be more convenient here.
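
Breaking out the character ranges could be sketched as below. The function name to_rfc3339 is hypothetical, the trailing Z assumes the timestamps are UTC, and whether a particular from_rfc3339() accepts nine fractional digits is not verified here.

```python
def to_rfc3339(stamp: str) -> str:
    """Hypothetical helper: 23-digit stamp -> RFC 3339 style string (UTC assumed)."""
    if len(stamp) != 23 or not stamp.isdigit():
        raise ValueError(f"expected 23 digits, got {stamp!r}")
    date = f"{stamp[0:4]}-{stamp[4:6]}-{stamp[6:8]}"       # YYYY-MM-DD
    time = f"{stamp[8:10]}:{stamp[10:12]}:{stamp[12:14]}"  # HH:MM:SS
    # The remaining 9 digits become the fractional seconds.
    return f"{date}T{time}.{stamp[14:]}Z"
```

With the zero-padded form of 19551231 this gives 1955-12-31T00:00:00.000000000Z.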


It's worth noting that numpy offers support for nanosecond precision.

J_H
  • Unfortunately `from_rfc3339()` raises `ValueError` because my string `19551231` does not match the expected format, e.g. `2019-10-12T07:20:50.52Z`. Appending zeros to my string does not make it RFC 3339. Is there a way to convert it properly? `strptime()` does not help here because it converts `19551231` to `1955-12-31 00:00:00`, which is still not accepted by `from_rfc3339()`. – cuzureau Sep 07 '21 at 08:04
  • Plus: the function `from_rfc3339()` returns a `` and not a `` – cuzureau Sep 07 '21 at 08:10
  • Ouch! We get back a microsecond datetime, which won't support nanosecond? Sigh! – J_H Sep 08 '21 at 00:42

I learned that a datetime object exposes its fields as attributes: for example, date.hour gives the hour (and likewise for year, month, etc.).

Knowing this, converting a string to a DatetimeWithNanoseconds takes 2 easy steps:

  1. Convert the string to a datetime:
import datetime
date = '19551231'
date = datetime.datetime.strptime(date, '%Y%m%d')
  2. Convert to DatetimeWithNanoseconds:
nano = DatetimeWithNanoseconds(date.year, date.month, date.day, date.hour, date.minute, date.second, nanosecond=0)
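
Since the question mentions strings of varying layout, the two steps could be generalized along these lines. This is only a sketch: it assumes the layout can be inferred from the string length alone, and the names FORMATS, parse_timestamp and datetime_fields are hypothetical.

```python
import datetime

# Hypothetical lookup table: string length -> strptime layout.
# Only the two layouts from the question are listed; extend as needed.
FORMATS = {8: "%Y%m%d", 14: "%Y%m%d%H%M%S"}

def parse_timestamp(ts: str) -> datetime.datetime:
    """Pick the layout by length, then let strptime do the parsing."""
    try:
        fmt = FORMATS[len(ts)]
    except KeyError:
        raise ValueError(f"unsupported timestamp length: {ts!r}") from None
    return datetime.datetime.strptime(ts, fmt)

def datetime_fields(ts: str) -> tuple:
    """Return (year, month, day, hour, minute, second) for the given string."""
    d = parse_timestamp(ts)
    return (d.year, d.month, d.day, d.hour, d.minute, d.second)
```

The resulting tuple can then be unpacked into the constructor, e.g. DatetimeWithNanoseconds(*datetime_fields('19551231'), nanosecond=0).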
cuzureau