
The following code throws an "Out of bounds nanosecond timestamp: 1452-04-15 00:00:00" error. The same code works if I replace the date strings with recent dates such as 2017-01-01.

df=pd.DataFrame({'Date':np.arange('1452-04-15', '1519-05-02', dtype='datetime64[D]')})

This example code is only an easy way to reproduce the error. What I am really trying to do is read a CSV containing very early dates like these into a DataFrame, and convert the string dates into np.datetime64[D] or any comparable date format.

GoCurry
    Can the downvoter explain why you did it? I don't think my question is duplicated, and I don't see any easy way to convert a series of date strings outside [year 1677, year 2262] to a datetime-like format. – GoCurry May 10 '18 at 04:48

1 Answer


You need period_range:

r = pd.period_range('1452-04-15', '1519-05-02')
print (r)
PeriodIndex(['1452-04-15', '1452-04-16', '1452-04-17', '1452-04-18',
             '1452-04-19', '1452-04-20', '1452-04-21', '1452-04-22',
             '1452-04-23', '1452-04-24',
             ...
             '1519-04-23', '1519-04-24', '1519-04-25', '1519-04-26',
             '1519-04-27', '1519-04-28', '1519-04-29', '1519-04-30',
             '1519-05-01', '1519-05-02'],
            dtype='period[D]', length=24488, freq='D')

df = pd.DataFrame({'Date' : r})
print (df.head())
        Date
0 1452-04-15
1 1452-04-16
2 1452-04-17
3 1452-04-18
4 1452-04-19

Periods are needed here because Timestamp has a limited range:

In [66]: pd.Timestamp.min
Out[66]: Timestamp('1677-09-21 00:12:43.145225')

In [67]: pd.Timestamp.max
Out[67]: Timestamp('2262-04-11 23:47:16.854775807')
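
As a rough illustration of where these limits come from: a Timestamp stores nanoseconds since 1970-01-01 in a signed 64-bit integer, so the representable range is about 292 years on either side of the epoch. The sketch below reproduces the bounds with the standard library only, truncated to microseconds because `datetime` has no nanosecond field. The variable names are made up for this example.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

# Largest and smallest values of a signed 64-bit integer, read as nanoseconds.
INT64_MAX_NS = 2**63 - 1
INT64_MIN_NS = -(2**63)

# datetime only resolves microseconds, so integer-divide the nanoseconds by 1000.
upper = EPOCH + timedelta(microseconds=INT64_MAX_NS // 1000)
lower = EPOCH + timedelta(microseconds=INT64_MIN_NS // 1000)

print(upper)  # 2262-04-11 23:47:16.854775
print(lower)  # falls on 1677-09-21, matching pd.Timestamp.min to the second

# The first date from the question lies well before the lower bound.
print(datetime(1452, 4, 15) < lower)  # True
```

Any date before the lower bound, such as 1452-04-15, cannot be stored at nanosecond resolution, which is why the question's code fails.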
jezrael
  • Thanks! Is there a function to convert a series of date strings to Period, analogous to pd.to_datetime? – GoCurry May 10 '18 at 04:30
  • @Jimbo - Is it possible to use the `conv` function from [this](http://pandas.pydata.org/pandas-docs/stable/timeseries.html#representing-out-of-bounds-spans)? – jezrael May 10 '18 at 04:31
  • Thanks. That's an approach. But I would be shocked if pandas doesn't provide any convenient built-in way to do this conversion. – GoCurry May 10 '18 at 04:43
  • @Jimbo - Not implemented yet. I guess because `datetime`s are used very often and `period`s much more rarely. – jezrael May 10 '18 at 04:46
  • Is it really that rare to deal with datetimes outside the range [year 1677, year 2262]? What do people do if they don't use `period`? – GoCurry May 10 '18 at 04:50
  • @Jimbo - I think datetimes are generally used, but for dates outside `[year 1677, year 2262]` this is the only way in pandas :( – jezrael May 10 '18 at 04:53
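
The `conv` approach discussed in the comments could look roughly like the sketch below. It assumes the dates are strings in `YYYY-MM-DD` form. The helper name `to_daily_period`, the column names, and the inline CSV text are hypothetical and stand in for the real file. The helper builds each `pd.Period` from integer year, month and day, as the linked documentation example does, and is passed to `read_csv` through its `converters` argument.

```python
import io

import pandas as pd


def to_daily_period(text):
    """Hypothetical helper: turn a 'YYYY-MM-DD' string into a daily Period."""
    year, month, day = (int(part) for part in text.strip().split("-"))
    return pd.Period(year=year, month=month, day=day, freq="D")


# Stand-in for the real CSV file (assumed layout: a 'Date' column of strings).
csv_text = "Date,Value\n1452-04-15,1\n1452-04-16,2\n1519-05-02,3\n"

# converters applies the helper to every raw string in the 'Date' column.
df = pd.read_csv(io.StringIO(csv_text), converters={"Date": to_daily_period})
print(df)

# The same helper also works on an existing Series of date strings.
dates = pd.Series(["1452-04-15", "1452-04-16", "1519-05-02"])
periods = dates.apply(to_daily_period)
print(periods.iloc[0].year, periods.iloc[0].month, periods.iloc[0].day)
```

Other input formats would need a different parsing step inside the helper. The rest stays the same.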