-1

I am trying to calculate RSI on a dataframe

df = pd.DataFrame({"Close": [100,101,102,103,104,105,106,105,103,102,103,104,103,105,106,107,108,106,105,107,109]})

df["Change"] = df["Close"].diff()

df["Gain"] = np.where(df["Change"]>0,df["Change"],0)

df["Loss"] = np.where(df["Change"]<0,abs(df["Change"]),0 )
df["Index"] = [x for x in range(len(df))]

print(df)

      Close  Change  Gain  Loss  Index
0     100     NaN   0.0   0.0      0
1     101     1.0   1.0   0.0      1
2     102     1.0   1.0   0.0      2
3     103     1.0   1.0   0.0      3
4     104     1.0   1.0   0.0      4
5     105     1.0   1.0   0.0      5
6     106     1.0   1.0   0.0      6
7     105    -1.0   0.0   1.0      7
8     103    -2.0   0.0   2.0      8
9     102    -1.0   0.0   1.0      9
10    103     1.0   1.0   0.0     10
11    104     1.0   1.0   0.0     11
12    103    -1.0   0.0   1.0     12
13    105     2.0   2.0   0.0     13
14    106     1.0   1.0   0.0     14
15    107     1.0   1.0   0.0     15
16    108     1.0   1.0   0.0     16
17    106    -2.0   0.0   2.0     17
18    105    -1.0   0.0   1.0     18
19    107     2.0   2.0   0.0     19
20    109     2.0   2.0   0.0     20


RSI_length = 7

Now, I am stuck in calculating "Avg Gain". The logic for average gain here is for first average gain at index 6 will be mean of "Gain" for RSI_length periods. For consecutive "Avg Gain" it should be

(Previous Avg Gain * (RSI_length - 1) + "Gain") / RSI_length 

I tried the following but does not work as expected

df["Avg Gain"] = np.nan
df["Avg Gain"] = np.where(df["Index"]==(RSI_length-1),df["Gain"].rolling(window=RSI_length).mean(),\
                          np.where(df["Index"]>(RSI_length-1),(df["Avg Gain"].iloc[df["Index"]-1]*(RSI_length-1)+df["Gain"]) / RSI_length,np.nan))

The output of this code is:

print(df)

     Close  Change  Gain  Loss  Index  Avg Gain
0     100     NaN   0.0   0.0      0       NaN
1     101     1.0   1.0   0.0      1       NaN
2     102     1.0   1.0   0.0      2       NaN
3     103     1.0   1.0   0.0      3       NaN
4     104     1.0   1.0   0.0      4       NaN
5     105     1.0   1.0   0.0      5       NaN
6     106     1.0   1.0   0.0      6  0.857143
7     105    -1.0   0.0   1.0      7       NaN
8     103    -2.0   0.0   2.0      8       NaN
9     102    -1.0   0.0   1.0      9       NaN
10    103     1.0   1.0   0.0     10       NaN
11    104     1.0   1.0   0.0     11       NaN
12    103    -1.0   0.0   1.0     12       NaN
13    105     2.0   2.0   0.0     13       NaN
14    106     1.0   1.0   0.0     14       NaN
15    107     1.0   1.0   0.0     15       NaN
16    108     1.0   1.0   0.0     16       NaN
17    106    -2.0   0.0   2.0     17       NaN
18    105    -1.0   0.0   1.0     18       NaN
19    107     2.0   2.0   0.0     19       NaN
20    109     2.0   2.0   0.0     20       NaN

Desired output is:

    Close  Change   Gain  Loss  Index  Avg Gain
0     100      NaN     0     0      0       NaN
1     101      1.0     1     0      1       NaN
2     102      1.0     1     0      2       NaN
3     103      1.0     1     0      3       NaN
4     104      1.0     1     0      4       NaN
5     105      1.0     1     0      5       NaN
6     106      1.0     1     0      6  0.857143
7     105     -1.0     0     1      7  0.734694
8     103     -2.0     0     2      8  0.629738
9     102     -1.0     0     1      9  0.539775
10    103      1.0     1     0     10  0.605522
11    104      1.0     1     0     11  0.661876
12    103     -1.0     0     1     12  0.567322
13    105      2.0     2     0     13  0.771990
14    106      1.0     1     0     14  0.804563
15    107      1.0     1     0     15  0.832483
16    108      1.0     1     0     16  0.856414
17    106     -2.0     0     2     17  0.734069
18    105     -1.0     0     1     18  0.629202
19    107      2.0     2     0     19  0.825030
20    109      2.0     2     0     20  0.992883

Ravan
  • 35
  • 7
  • Are you certain that the formula "(Previous Avg Gain * (RSI_length - 1) + "Gain") / RSI_length" is correct for RSI? It doesn't seem to be taking 'loss' into account. – Roy2012 May 23 '20 at 15:39
  • @Roy2012 This is not the final rsi value, this formula is for calculating Average gain and not RSI – Ravan May 23 '20 at 16:17
  • I believe your implementation isn't working because at each cell, it tries to look at the value of the previous cell (index -1) - before the value of the previous cell was calculated. – Roy2012 May 23 '20 at 16:48
  • Still it should work for index 7 – Ravan May 23 '20 at 17:07
  • Emm. I believe the culprit is with the expression df["Avg Gain"].iloc[df["Index"]-1]. Even though it shifts everything by 1, the indexes stay the same. That is, instead of getting a series with index 0-20, and a value at index 6, you get a series of index 20-0-19, still - with a value at index 6. – Roy2012 May 23 '20 at 18:47

1 Answers1

0

(edited)

Here's an implementation of your formula.

RSI_LENGTH = 7

rolling_gain = df["Gain"].rolling(RSI_LENGTH).mean()
df.loc[RSI_LENGTH-1, "RSI"] = rolling_gain[RSI_LENGTH-1]

for inx in range(RSI_LENGTH, len(df)):
    df.loc[inx, "RSI"] = (df.loc[inx-1, "RSI"] * (RSI_LENGTH -1) + df.loc[inx, "Gain"]) / RSI_LENGTH

The result is:

    Close  Change  Gain  Loss  Index       RSI
0     100     NaN   0.0   0.0      0       NaN
1     101     1.0   1.0   0.0      1       NaN
2     102     1.0   1.0   0.0      2       NaN
3     103     1.0   1.0   0.0      3       NaN
4     104     1.0   1.0   0.0      4       NaN
5     105     1.0   1.0   0.0      5       NaN
6     106     1.0   1.0   0.0      6  0.857143
7     105    -1.0   0.0   1.0      7  0.734694
8     103    -2.0   0.0   2.0      8  0.629738
9     102    -1.0   0.0   1.0      9  0.539775
10    103     1.0   1.0   0.0     10  0.605522
11    104     1.0   1.0   0.0     11  0.661876
12    103    -1.0   0.0   1.0     12  0.567322
13    105     2.0   2.0   0.0     13  0.771990
14    106     1.0   1.0   0.0     14  0.804563
15    107     1.0   1.0   0.0     15  0.832483
16    108     1.0   1.0   0.0     16  0.856414
17    106    -2.0   0.0   2.0     17  0.734069
18    105    -1.0   0.0   1.0     18  0.629202
19    107     2.0   2.0   0.0     19  0.825030
20    109     2.0   2.0   0.0     20  0.992883
Roy2012
  • 11,755
  • 2
  • 22
  • 35
  • Hi, I have edited my output as I took RSI_length as 14 and not 7 in previous output. Its updated now. – Ravan May 23 '20 at 16:31
  • Are you showing the RSI, or then result of your partial formula? As you mentioned above, they are not the same. – Roy2012 May 23 '20 at 16:33
  • Your implementation is only correct for value at index 7 but not for the later part. For index 8 I want (0.734694 * 6 + 0)/7 = 0.629738 and your solution take same output. It should update previous gain as each row is filled – Ravan May 23 '20 at 16:34
  • This formula mentioned is for Average gain part of RSI and not the final RSI – Ravan May 23 '20 at 16:35
  • @jerzrael if you can help me on this – Ravan May 23 '20 at 16:41
  • Got it. Let me look at it again - or drop my answer because it's currently incorrect. – Roy2012 May 23 '20 at 16:44
  • Have a look at this updated answer. Does it do the right thing? – Roy2012 May 23 '20 at 18:30
  • Yes this works perfectly as expected. Btw is there any way to solve this by vectorisation rather than using loops. – Ravan May 24 '20 at 05:05
  • If it solves the question - could you please mark it as the correct answer? As to vectorization - I don't think so, but I'll give it another try. – Roy2012 May 24 '20 at 05:13