I am struggling to set a DatetimeIndex on monthly data that starts in year 1 AD. I do not have the same problem with the same data over a shorter time span, say 1950 to 2020.

This code works

co2data_monthly = pd.read_excel(path to my data)
co2data_monthly = co2data_monthly[co2data_monthly['year']>=1950]
dtindex = np.arange('1950-01-01', '2020-04-01', dtype='datetime64[M]')
co2data_monthly = co2data_monthly.set_index(dtindex)

This code does not work

co2data_monthly = pd.read_excel(path to my data)
co2data_monthly = co2data_monthly[co2data_monthly['year']>=1950]
dtindex = np.arange('0001-01-01', '2020-04-01', dtype='datetime64[M]')
co2data_monthly = co2data_monthly.set_index(dtindex)

So when I try to define January 1st of 1 A.D., I get the following error message: Out of bounds nanosecond timestamp: 1-01-01 00:00:00

What I want...

My final output should be my dataset with a monthly datetime index from year 1 to year 2020.

Jorge Alonso

1 Answer

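A pandas datetime is stored as a signed 64-bit integer number of time units (nanoseconds by default) since a particular origin date (default: Unix epoch, i.e. 1970-01-01). Because the count is a 64-bit integer, only a limited span of dates around the origin can be represented.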
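The limits of that storage scheme can be inspected directly, since pandas exposes the earliest and latest nanosecond timestamps it can hold. A quick check, assuming only that pandas is installed:

```python
import pandas as pd

# Earliest and latest instants a nanosecond-resolution Timestamp can hold
earliest = pd.Timestamp.min
latest = pd.Timestamp.max

# The years alone show how narrow the window is
print(earliest.year, latest.year)  # 1677 2262
```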

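Since 0001-01-01 lies far outside the range that nanosecond resolution can represent (roughly the years 1677 to 2262), you get that error. One option is to specify the Julian calendar epoch (January 1, 4713 BC) as the origin and use D (daily) as the unit. Note that the values passed in must then be Julian day numbers, and the result is still a nanosecond timestamp, so dates outside that range will still be rejected; for such spans a period-based index (pandas.period_range) is the usual alternative.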

You can skip the arange when creating the index and just use pandas.to_datetime() instead. Something like the code below, which assumes that your dates are in the 'year' column (this might not be correct). You may also need to specify the date format explicitly using the format argument.

...
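# Interprets the values as day counts from the Julian epoch (assumes the column holds Julian day numbers)
co2data_monthly['time'] = pd.to_datetime(co2data_monthly['year'], unit='D', origin='julian')
# Use the new column as the index
co2data_monthly = co2data_monthly.set_index('time')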
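If the goal is only a monthly index running from year 1 to 2020, a period-based index is one way to sidestep the nanosecond limit altogether. The sketch below is an alternative to the to_datetime approach, not a reconstruction of it; the toy DataFrame and the column name co2 are hypothetical stand-ins for the Excel data, which is assumed to hold one row per month in chronological order.

```python
import numpy as np
import pandas as pd

# Build the end points from fields rather than strings, so no date parsing is involved
start = pd.Period(year=1, month=1, freq="M")
end = pd.Period(year=2020, month=3, freq="M")

# One entry per month, both end points included
month_index = pd.period_range(start=start, end=end, freq="M")

# Hypothetical stand-in for the real data: one value per month
co2data_monthly = pd.DataFrame({"co2": np.zeros(len(month_index))})
co2data_monthly = co2data_monthly.set_index(month_index)
```

Rows can then be selected by period, for example co2data_monthly.loc[pd.Period(year=1950, month=1, freq="M")].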
Jon Nordby