In pandas, you can access specific positions of a time series either by classical integer-position (row-based) indexing or by datetime-based indexing. An integer index can be manipulated with basic arithmetic: e.g., if I have an integer_index for a time series with a frequency of 12 hours and I want to access the entry exactly one day prior, I can simply do integer_index - 2. However, real-world data are not always perfect, and sometimes rows are missing. In that case this method fails, and it would be helpful to be able to use datetime-based indexing and subtract, for example, one day from the index. How can I do this?
Sample script:
# generate a sample time series
import pandas as pd
s = pd.Series(["A", "B", "C", "D", "E"], index=pd.date_range("2000-01-01", periods=5, freq="12h"))
print(s)
# 2000-01-01 00:00:00 A
# 2000-01-01 12:00:00 B
# 2000-01-02 00:00:00 C
# 2000-01-02 12:00:00 D
# 2000-01-03 00:00:00 E
# Freq: 12H, dtype: object
# these two indices should access the same value ("C")
integer_index = 2
date_index = "2000-01-02 00:00"
print(s.iloc[integer_index])  # prints "C"
print(s[date_index])  # prints "C"
# I can access the value one day earlier by subtracting 2 from the integer position
print(s.iloc[integer_index - 2])  # prints "A"
# how can I subtract one day from the date index?
print(s[date_index - 1])  # raises an error (cannot subtract an int from a str)
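To make the goal concrete, here is a minimal sketch of the kind of date arithmetic I am after. It assumes the label is held as a pd.Timestamp rather than a plain string, so that a pd.Timedelta can be subtracted from it. The names one_day_earlier and s_gappy are only illustrative, and this may not be the most idiomatic approach.

```python
import pandas as pd

# Same sample series as in the script above.
s = pd.Series(
    ["A", "B", "C", "D", "E"],
    index=pd.date_range("2000-01-01", periods=5, freq="12h"),
)

# Hold the label as a Timestamp instead of a string so arithmetic is defined.
date_index = pd.Timestamp("2000-01-02 00:00")
one_day_earlier = date_index - pd.Timedelta(days=1)

# Label-based lookup of the row exactly one day before.
value = s.loc[one_day_earlier]

# With a missing row, .loc would raise KeyError; Series.get returns None instead.
s_gappy = s.drop(pd.Timestamp("2000-01-01 00:00"))
missing = s_gappy.get(one_day_earlier)
```

On the complete series the lookup returns the row for 2000-01-01 00:00, and on the series with that row dropped the get call returns None instead of silently pointing at the wrong row, which is the behaviour that integer arithmetic cannot give me.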
The background to this question can be found in an earlier submission of mine, "Fill data gaps with average of data from adjacent days", where user JohnE found a workaround to my problem that uses integer-position-based indexing. He makes sure that I have equally spaced data by resampling the time series.
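For context, the resampling idea can be sketched roughly as follows. This is my own simplified illustration, not JohnE's actual code: it uses asfreq as a stand-in for the resampling step, and the names gappy, regular and pos are hypothetical.

```python
import pandas as pd

idx = pd.date_range("2000-01-01", periods=5, freq="12h")
s = pd.Series(["A", "B", "C", "D", "E"], index=idx)

# Simulate imperfect data by dropping the 2000-01-01 12:00 row.
gappy = s.drop(idx[1])

# Re-create the regular 12-hour grid; the missing row comes back as NaN.
regular = gappy.asfreq("12h")

# On the regular grid, "one day earlier" is again two positions back.
pos = regular.index.get_loc(pd.Timestamp("2000-01-02 00:00"))
previous_day = regular.iloc[pos - 2]
```

This restores the fixed offset between positions, but at the cost of reindexing the whole series first, which is why I would prefer to subtract directly from the datetime label.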