I have been learning Python over the past few weeks and I have an issue with the .loc indexer.
I have a dataframe (BAC) made up of daily equity prices, with dates as the index (which appears to be a DatetimeIndex). I want to select only the dates in 2008 and compute a 30-day rolling mean of the 'Close' column.
Here is my code and output :
BAC.info()
OUT :
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 2769 entries, 2015-12-31 to 2005-01-03
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Open    2769 non-null   float64
 1   High    2769 non-null   float64
 2   Low     2769 non-null   float64
 3   Close   2769 non-null   float64
 4   Volume  2768 non-null   float64
dtypes: float64(5)
memory usage: 209.8 KB
and :
BAC['Close'].loc['2008-01-01':'2009-01-01'].rolling(window=30).mean()
OUT: Series([], Name: Close, dtype: float64)
The code : BAC.loc['2008-01-01':'2009-01-01', 'Close'].rolling(window=30).mean() yields the same result.
So I don't get an error, but I get an empty Series, and I think there is an issue with the format of the dates I pass in. The course I am following uses .ix, which has since been deprecated, and I understood that .loc does more or less the same thing for label-based selection (or .iloc when selecting by row or column position).
After that, I tried :
BAC.loc['2008', 'Close'].rolling(window=30).mean()
OUT :
2008-12-31 NaN
2008-12-30 NaN
2008-12-29 NaN
2008-12-26 NaN
2008-12-24 NaN
...
2008-01-08 35.948233
2008-01-07 35.858433
2008-01-04 35.775933
2008-01-03 35.705700
2008-01-02 35.656400
Name: Close, Length: 253, dtype: float64
So it works, but the rolling window starts at the END of 2008 (the first rows, in late December, are NaN) and not at the beginning. Why is that?
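One thing I noticed in the info() output is that the index runs from 2015-12-31 down to 2005-01-03, so it is in descending order. Below is a minimal sketch on synthetic data, assuming that ordering is what matters here. The frame `df` and its prices are made up for illustration and are not my real BAC data. It compares the rolling mean on the descending index with the same calculation after `sort_index()`:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for BAC (hypothetical data): business days, newest first.
idx = pd.bdate_range("2007-01-01", "2009-12-31")[::-1]
df = pd.DataFrame({"Close": np.arange(len(idx), dtype=float)}, index=idx)

# rolling() walks the rows in their stored order, so on a descending index
# the first 29 rows (the latest dates of 2008) are the ones left as NaN.
desc_2008 = df.loc[df.index.year == 2008, "Close"].rolling(window=30).mean()

# Sorting the index ascending first moves the NaN warm-up to early 2008,
# and a plain date-range slice then selects the year.
asc = df.sort_index()
asc_2008 = asc.loc["2008-01-01":"2008-12-31", "Close"].rolling(window=30).mean()

print(desc_2008.head(3))  # starts at the end of 2008
print(asc_2008.head(3))   # starts at the beginning of 2008
```

If this assumption holds for my data, something like BAC.sort_index() before the .loc slice would be the equivalent step, but I have not confirmed that this is the intended way to do it.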
Any help would be greatly appreciated. Thanks!