1

I have a dataframe with 2 columns - Date and Price. The data is sorted with newest date first (23 Jan in first row, 22 Jan in second row and so on).

Date   Price
23 Jan 100
22 Jan 95
21 Jan 90
.
.
.

I want to calculate 2 days rolling average price for this time series data. I am using this:

df.rolling(2).mean()

What this does is, it assigns NaN to the first row (23 Jan) and then for the second row gives the output as the mean of prices on 23 Jan and 22 Jan. This is not useful as 22 Jan average is using forward data (price of 23 Jan). What I need is that the moving average value for 23 Jan is the average of 23 Jan & 22 Jan. This way the last value of MA would be NaN instead of first value.

What I do not want to do is sort this data with oldest first, compute and then resort.

I had the same issue with pct_change(). However, pct_change(-1) solved that issue. But rolling does not accept negative value as an input. Please suggest a workaround this issue. Thanks.

cs95
  • 379,657
  • 97
  • 704
  • 746
Manish Jain
  • 13
  • 1
  • 3
  • Why don't you want your data to be sorted chronologically (in ascending order)? Is there a particular reason? – Joe Patten Jan 23 '19 at 07:00
  • 1
    I have been developing these systems in excel and now moving them to python for easier development. I am used to seeing values sorted as newest date first. Irrespective, I would assume that python will have some way to take care of this issue. Just want to check on that. Changing the sorting would be the last resort anyways. – Manish Jain Jan 23 '19 at 07:04

1 Answers1

3

Since you don't want to sort, here is one workaround. You could reverse your dataframe, take the rolling mean, then reverse it again.

df[::-1].rolling(window=2).mean()[::-1]

Output:

        Price
23 Jan  97.5
22 Jan  92.5
21 Jan  NaN
Joe Patten
  • 1,664
  • 1
  • 9
  • 15
  • 1
    Thanks. This totally serves the purpose. Side note - do you use lot of time series data in python? Is it a usual practice to use chronologically ascending data? Does that help in applying the built-in functions? – Manish Jain Jan 23 '19 at 07:18
  • I work with a lot of panel data (think of cross-sectional and time series put together). And yes, generally data will be sorted by time in ascending order, but that does not mean you have to do it that way. Most pandas functions will work on data that is in ascending or descending order. – Joe Patten Jan 23 '19 at 07:23