2

For a pandas.DataFrame example:

In: cols = ['cols1', 'cols2']
In: df = pd.DataFrame({'col1': [1, 2, 3, 4], 'col2': [3, 4, 5, 6]})

Out:       col1  col2
      0     1     3
      1     2     4
      2     3     5
      3     4     6

I am using the Exponential Weighted Average:

In: for i in range(len(df.columns)):
       df[cols[i]] = df[cols[i]].ewm(com=None, span=None, halflife=None, 
                                     alpha=.8, min_periods=0, adjust=True,
                                     ignore_na=False, axis=0).mean()

It works great! However the weights are applied with exponentially decreasing values from top down:

  Out:      col1      col2
      0  1.000000  3.000000
      1  1.833333  3.833333
      2  2.774194  4.774194
      3  3.756410  5.756410

I am wondering if there is a way to apply the weights in the reverse order (bottom up). My desired output is:

Out:          col1          col2
      0     0.9391025     2.8173075
      1     1.8494627     3.6982925
      2     2.7499995     4.5833325
      3     4.000000      6.000000
kel
  • 113
  • 1
  • 1
  • 6
  • Can you post your desired output? – user3483203 Jun 07 '18 at 18:06
  • I added the desired output. I just want the weights to be applied in reverse order to the data. My most relevant data is at the bottom of the table so I want it to have the most weight. – kel Jun 07 '18 at 18:22

2 Answers2

2

A variation on kel's solution.

Note that

df[col].reindex(index=df.index[::-1])

achieves a similar result as:

df[col].sort_index(ascending=False)

One difference is that reindex changes the index in place, while sort_index takes inplace as an option. As a result, an advantage with sort_index is that you don't need to switch things back at the end, unlike with reindex:

df[col] = df[col].sort_index(ascending=False).ewm(...).mean()

Note also that the solution with sort_index of course assumes that the index values are ordered in the right way. If they are not, I am not sure if taking a rolling average would make physical sense.

Antoine
  • 600
  • 7
  • 19
  • An even simpler solution would be to use slicing: `df[::-1][col].ewm(...).mean()`. I haven't compared the relative speeds. – Antoine Apr 12 '19 at 11:23
1

I found a solution to the problem. You can reindex before and after applying ewm():

In: df.reindex(index=df.index[:-1])
    for i in range(len(df.columns)):
        df[cols[i]] = df[cols[i]].ewm(com=None, span=None, halflife=None, 
                                      alpha=.8, min_periods=0, adjust=True,
                                      ignore_na=False, axis=0).mean()
Out:        col1      col2
      3  4.000000  6.000000
      2  3.166667  5.166667
      1  2.225806  4.225806
      0  1.243590  3.243590

Then you can apply it again:

In: df.reindex(index=df.index[:-1])
Out:        col1      col2
       0  1.243590  3.243590
       1  2.225806  4.225806
       2  3.166667  5.166667
       3  4.000000  6.000000

I am still unsure if this is the most efficient method. So, if anyone has any other ideas, I would love to know.

kel
  • 113
  • 1
  • 1
  • 6