
I have the following few samples of data:

             close
date
2018-11-13  192.23
2018-11-12  194.17
2018-11-09  204.47
2018-11-08  208.49
2018-11-07  209.95
2018-11-06  203.77
2018-11-05  201.59
2018-11-02  207.48
2018-11-01  222.22
2018-10-31  218.86
2018-10-30  213.30
2018-10-29  212.24
2018-10-26  216.30
2018-10-25  219.80
2018-10-24  215.09
...

Using this window,

resampleStr = '2D'

and this code:

res = pd.concat(
    [
        df['close'].rolling(wind).apply(lambda x: (x[-1] - x[-0]) / x[-1]).reset_index(),
        df.reset_index()['date'].shift(-wind).rename('T-' + resampleStr),
        df.reset_index()['close'].rename('today'),
        df.reset_index()['close'].shift(-wind).rename('T-' + resampleStr),
    ],
    axis=1,
)
res = res.dropna()

print(res)

I get this (partial) result. How can this be? For example, for the first row I expect (194.17 - 208.49) / 208.49 = -0.06868, and yet the result shows -0.009991.

          date     close       T-2D   today    T-2D
1   2018-11-12 -0.009991 2018-11-08  194.17  208.49
2   2018-11-09 -0.050374 2018-11-07  204.47  209.95
3   2018-11-08 -0.019282 2018-11-06  208.49  203.77
4   2018-11-07 -0.006954 2018-11-05  209.95  201.59
5   2018-11-06  0.030328 2018-11-02  203.77  207.48
6   2018-11-05  0.010814 2018-11-01  201.59  222.22
7   2018-11-02 -0.028388 2018-10-31  207.48  218.86
8   2018-11-01 -0.066331 2018-10-30  222.22  213.30
9   2018-10-31  0.015352 2018-10-29  218.86  212.24
10  2018-10-30  0.026067 2018-10-26  213.30  216.30
11  2018-10-29  0.004994 2018-10-25  212.24  219.80
12  2018-10-26 -0.018770 2018-10-24  216.30  215.09
13  2018-10-25 -0.015924 2018-10-23  219.80  222.73
14  2018-10-24  0.021898 2018-10-22  215.09  220.65

EDIT 1

Inside the lambda, as per rafaelc's suggestion, I used `lambda x: print(x) or ...` followed by the rest of the expression. It prints out these (partial) values. It is not using the window!!! WTF?

date
2018-11-13    192.23
2018-11-12    194.17
dtype: float64
date
2018-11-12    194.17
2018-11-09    204.47
dtype: float64
date
2018-11-09    204.47
2018-11-08    208.49
dtype: float64
date
2018-11-08    208.49
2018-11-07    209.95
dtype: float64
date
2018-11-07    209.95
2018-11-06    203.77
dtype: float64
date
2018-11-06    203.77
2018-11-05    201.59
dtype: float64
date
2018-11-05    201.59
2018-11-02    207.48
dtype: float64
date
2018-11-02    207.48
2018-11-01    222.22
dtype: float64
date
2018-11-01    222.22
2018-10-31    218.86
dtype: float64
date
2018-10-31    218.86
2018-10-30    213.30
dtype: float64
date
2018-10-30    213.30
2018-10-29    212.24
dtype: float64
date
2018-10-29    212.24
2018-10-26    216.30
dtype: float64
date
2018-10-26    216.3
2018-10-25    219.8
dtype: float64
date
2018-10-25    219.80
2018-10-24    215.09
dtype: float64
Ivan

1 Answer


The values 194.17 and 208.49 are for 2018-11-12 and 2018-11-08 respectively. Those two dates are never part of the same 2-day window, which is what you defined.
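To make the difference concrete, here is a minimal pure-Python sketch (no pandas) that contrasts a 2-calendar-day span with a 2-row look-back on the first few rows from the question. The helper `rows_within_days` is a hypothetical name introduced for illustration, and the half-open span `(anchor - days, anchor]` is an assumption meant to mirror how an offset window such as `'2D'` is closed on the right by default.

```python
from datetime import date, timedelta

# First few rows from the question, newest first.
rows = [
    (date(2018, 11, 13), 192.23),
    (date(2018, 11, 12), 194.17),
    (date(2018, 11, 9), 204.47),
    (date(2018, 11, 8), 208.49),
    (date(2018, 11, 7), 209.95),
]


def rows_within_days(rows, anchor, days):
    """Return rows whose date lies in the span (anchor - days, anchor]."""
    start = anchor - timedelta(days=days)
    return [(d, c) for d, c in rows if start < d <= anchor]


anchor = date(2018, 11, 12)

# Time-based view: which rows fall within 2 calendar days of the anchor?
in_window = rows_within_days(rows, anchor, 2)

# Position-based view: which row sits 2 positions further down the list?
idx = [d for d, _ in rows].index(anchor)
prior_date, prior_close = rows[idx + 2]

# The return the question expects, relative to the close 2 rows earlier.
ret = (rows[idx][1] - prior_close) / prior_close

print(in_window)
print(prior_date, prior_close, round(ret, 5))
```

In this sketch the 2-day span anchored at 2018-11-12 contains only that day's row, because the preceding weekend has no data, while the 2-row look-back reaches 2018-11-08 and its close of 208.49.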

rafaelc
  • Look at the line: 2018-11-12 -0.009991 2018-11-08 194.17 208.49. That is correct. On 11-12, the stock is at (TODAY) 194.17. Two days earlier (2018-11-08) it was at 208.49. The return is as I said, which is not what the lambda expression returns! – Ivan Jun 06 '22 at 20:04
  • @Ivan then you should work with a rolling window of 2 items, not 2 days. Stop using a datetime index. Instead, have the index be `[0, 1, 2, 3, ...]`. Then, use `.rolling(2)` instead of `.rolling('2D')`. – rafaelc Jun 06 '22 at 20:06
  • rafaelc, hmmm, could that be the problem? I am confused though, the lambda appears to be handed the correct dates/values? – Ivan Jun 06 '22 at 20:08
  • @Ivan take a second to check whether the lambda is taking the values you think it is ;). For that, do something like this: `lambda x: print(x) or (x[-1] - x[-0]) / x[-1]`. This will print `x` and then return the value `(x[-1] - x[-0]) / x[-1]`. It keeps the same behavior, but prints out which values are being processed. You'll see they're not always what you think they are. – rafaelc Jun 06 '22 at 20:10
  • rafaelc, will do. BTW, I just tried `2` instead of `'2D'`: identical results – Ivan Jun 06 '22 at 20:10
  • rafaelc, your print statement is showing that I am NOT getting a 2 window. I am completely confused. See OP, EDIT 1 – Ivan Jun 06 '22 at 20:15
  • @Ivan you see? The output you posted is for a 2-item rolling window. If you do this for a `'2D'` (2-day) rolling window, you'll see that it'd be very different. – rafaelc Jun 06 '22 at 20:18
  • Yes, thanks rafaelc - this helps although I am still stuck – Ivan Jun 06 '22 at 20:39
  • rafaelc, how do I get the index of the rolling result in `print(x) or ...`? – Ivan Jun 06 '22 at 20:52
  • In your lambda function, `x` is a `pd.Series` object, so you can just call `x.index` to see the index. In your case, `x.index` should return something like `[2018-11-02, 2018-11-01]` if you have `date` as the dataframe index and use `'2D'` as the window, or something like `[0, 1]` if you have `date` as a normal column rather than the index and use `2` as the window. – rafaelc Jun 06 '22 at 20:54
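
The suggestions in these comments can be combined into one small sketch, assuming pandas is installed. It moves `date` into an ordinary column, uses an item-based `.rolling(2)`, and records `x.index` for every window instead of printing it. The names `flat`, `seen`, `inspect`, `prior`, `ret`, and `n` are hypothetical and introduced only for illustration. The final lines are an assumption about what the question is after: a return measured against the close `n` rows earlier in time, computed with `shift` rather than with a rolling window.

```python
import pandas as pd

# First few rows from the question, newest first, with `date` as the index.
df = pd.DataFrame(
    {"close": [192.23, 194.17, 204.47, 208.49, 209.95]},
    index=pd.to_datetime(
        ["2018-11-13", "2018-11-12", "2018-11-09", "2018-11-08", "2018-11-07"]
    ),
)
df.index.name = "date"

# Make `date` a normal column so the index becomes 0, 1, 2, ...
flat = df.reset_index()

# Collect the index labels of each window passed to the function.
seen = []


def inspect(window):
    # With raw=False the window arrives as a pd.Series, so .index is available.
    seen.append(window.index.tolist())
    return window.iloc[-1]  # any number will do; only `seen` matters here


flat["close"].rolling(2).apply(inspect, raw=False)
print(seen)  # each entry is a pair of adjacent row positions

# Assumed goal: return relative to the close n rows further down (earlier in time).
n = 2
flat["prior"] = flat["close"].shift(-n)
flat["ret"] = (flat["close"] - flat["prior"]) / flat["prior"]
print(flat)
```

Each recorded window holds two adjacent rows, which matches the output in EDIT 1, whereas the `shift(-n)` columns compare rows that are `n` positions apart. That gap is one plausible reason the rolling result and the `today` / `T-2D` columns disagree.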