
An irregular time series is stored in a pandas.DataFrame with a DatetimeIndex set. I need the time difference between consecutive entries in the index.

I thought it would be as simple as

data.index.diff()

but got

AttributeError: 'DatetimeIndex' object has no attribute 'diff'

I tried

data.index - data.index.shift(1)

but got

ValueError: Cannot shift with no freq

I do not want to infer or enforce a frequency before doing this operation. The time series has large gaps that would be expanded into long runs of NaN, and the point is to find these gaps first.

So, what is a clean way to do this seemingly simple operation?

chrisaycock
clstaudt

2 Answers


At the time of writing, there is no diff method implemented for Index.

However, you can convert the index to a Series first. Use Index.to_series if you need to keep the original timestamps as the index of the result; use the Series constructor without an index argument if a default integer index is enough.

Code example:

import pandas as pd

rng = pd.to_datetime(['2015-01-10', '2015-01-12', '2015-01-13'])
data = pd.DataFrame({'a': range(3)}, index=rng)
print(data)
            a
2015-01-10  0
2015-01-12  1
2015-01-13  2

# to_series keeps the timestamps as the index of the result
a = data.index.to_series().diff()
print(a)
2015-01-10      NaT
2015-01-12   2 days
2015-01-13   1 days
dtype: timedelta64[ns]

# the Series constructor gives a default integer index instead
a = pd.Series(data.index).diff()
print(a)
0      NaT
1   2 days
2   1 days
dtype: timedelta64[ns]
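
Since the goal in the question is to find the large gaps, the differences can then be compared against a threshold. The following is only a minimal sketch: the names deltas, max_gap and gaps, and the one-day threshold, are assumptions made for illustration and do not come from the question.

```python
import pandas as pd

rng = pd.to_datetime(['2015-01-10', '2015-01-12', '2015-01-13'])
data = pd.DataFrame({'a': range(3)}, index=rng)

# Time elapsed since the previous entry, indexed by the later timestamp
deltas = data.index.to_series().diff()

# Hypothetical threshold: anything longer than one day counts as a gap
max_gap = pd.Timedelta(days=1)

# The leading NaT compares as False, so it is dropped automatically
gaps = deltas[deltas > max_gap]
print(gaps)
```

Each entry in gaps is labelled with the timestamp at which the gap ends, so the start of the gap is that timestamp minus the value.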

jaga
jezrael

This question is a bit old but anyway...

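I use numpy.diff(data.index) to get the time deltas. It works fine; note that the result is a NumPy array of timedelta64 values with one fewer element than the index.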
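
A short sketch of this approach, assuming a timezone-naive DatetimeIndex like the one in the other answer. The variable names and the step of wrapping the result in a Series are illustrative additions, not part of the original suggestion.

```python
import numpy as np
import pandas as pd

rng = pd.to_datetime(['2015-01-10', '2015-01-12', '2015-01-13'])

# Plain NumPy array of timedelta64 values, length len(rng) - 1
deltas = np.diff(rng)

# Optional: align each delta with the later of its two timestamps
aligned = pd.Series(deltas, index=rng[1:])
print(aligned)
```

Unlike Series.diff, there is no leading NaT here, so the array has to be aligned by hand if the timestamps are needed alongside the deltas.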

Abhishek Gurjar
Cunningham