3

I have dates imported from SPSS to Python via pandas. The dates are imported as integers (ordinal). For instance the date "2015-08-02" is imported as 13657852800. When I try

pd.to_datetime(13657852800, unit="s")
Traceback (most recent call last):

  File "<ipython-input-39-ae44044ad39e>", line 1, in <module>
    pd.to_datetime(13657852800, unit="s")

  File "/anaconda3/lib/python3.7/site-packages/pandas/core/tools/datetimes.py", line 611, in to_datetime
    result = convert_listlike(np.array([arg]), box, format)[0]

  File "/anaconda3/lib/python3.7/site-packages/pandas/core/tools/datetimes.py", line 203, in _convert_listlike_datetimes
    errors=errors)

  File "pandas/_libs/tslib.pyx", line 356, in pandas._libs.tslib.array_with_unit_to_datetime

OutOfBoundsDatetime: cannot convert input with unit 's'

I also tried the following after I learn spss origin date is "1582-10-14"

pd.to_datetime(13657852800, unit="us", origin="1582-10-14")
Traceback (most recent call last):

  File "<ipython-input-38-a90cfe340ca5>", line 1, in <module>
    pd.to_datetime(13657852800, unit="us", origin="1582-10-14")

  File "/anaconda3/lib/python3.7/site-packages/pandas/core/tools/datetimes.py", line 571, in to_datetime
    arg = _adjust_to_origin(arg, origin, unit)

  File "/anaconda3/lib/python3.7/site-packages/pandas/core/tools/datetimes.py", line 379, in _adjust_to_origin
    "origin {origin} is Out of Bounds".format(origin=origin))

OutOfBoundsDatetime: origin 1582-10-14 is Out of Bounds

How to convert spss ordinal integer date into an actual date in python using panda's to_datetime function?

Chriss Paul
  • 1,101
  • 6
  • 19

2 Answers2

3

After some time, I came up with a solution to this. To make the origins of SPSS and Python match each other is necessary to rescale the integer from SPSS with the number 12219379200 i.e. the number of seconds existing between "1582-10-14" and "1970-01-01" (the origin used by to_datetime)

pd.to_datetime(13657852800-12219379200, unit="s")

Returns

Timestamp('2015-08-02 00:00:00')
Chriss Paul
  • 1,101
  • 6
  • 19
1

The problem here is that pandas have the min and max boundary for datetime object

pd.Timestamp.min
Out[349]: Timestamp('1677-09-21 00:12:43.145225')
pd.Timestamp.max
Out[350]: Timestamp('2262-04-11 23:47:16.854775807')

But in SPSS I think the min will be on year 1582, and possible way for this problem

pd.to_datetime((13657852800/86400)-141428, unit='D')
Out[348]: Timestamp('2015-08-02 00:00:00')
BENY
  • 317,841
  • 20
  • 164
  • 234