5

After getting data of the SPY from Yahoo I create a channel of closing prices using, as you can see below, the Max and Min rolling window. The columns are HC and HL.

I need to create a column (which I have called flag) that shows 1 when the close price is equal to HC and this value continues until close will be equal to HL. At this point the value of Flag will be -1. As you can see it is very simple, the Flag can have just two values; 1 or -1.

The simple formula would be something like:

  1. if Close == HC then flag is 1
  2. if Close == HL then flag is -1
  3. if Close != HC and Close !=HL then flag is equal to the last value saved on the variable flag.

I have tried several things, including the code below, with no luck. The problem with this code is that the 0 value is shown. And I don't know how to make it disappear thru a conditional:

import pandas as pd
import pandas_datareader as dr
import numpy as np
from datetime import date

df = dr.data.get_data_yahoo('SPY',start='01-01-2019',end=date.today())

df['HC'] = df['Close'].rolling(20).max() 
df['LC'] = df['Close'].rolling(20).min() 

df['Flag'] = [1 if (df.loc[ei, 'Close'] == df.loc[ei, 'HC']) else 
              -1 if (df.loc[ei, 'Close'] == df.loc[ei, 'LC']) else   
              0 for ei in df.index]

Below you can see in blue the results of my code and in red the result that I need.

Image

Is there a simple way to do this? I would appreciate if someone could help me with it. Thank you!

MisterMiyagi
  • 44,374
  • 10
  • 104
  • 119
Martingale
  • 511
  • 1
  • 6
  • 15

3 Answers3

4

Although this is already answered, however, the fastest way to compute such result is usually using np.where as follows:

import pandas as pd
import pandas_datareader as dr
import numpy as np
from datetime import date

df = dr.data.get_data_yahoo('SPY',start='01-01-2019',end=date.today())

df['HC'] = df['Close'].rolling(20).max() 
df['LC'] = df['Close'].rolling(20).min() 

There is a nested logic in the following:

  1. Create an empty array
  2. Replace Values with -1 under condition
  3. Replace Values with 1 under condition
df['Flag'] = np.where((df.Close == df.HC), 1, 
         np.where(df.Close == df.LC, -1, np.full(df.Close.count(), np.nan)))
df.Flag.fillna(method='ffill', inplace=True)

In terms of performance:

%%timeit
df['Flag'] = np.where((df.Close == df.HC), 1, 
         np.where(df.Close == df.LC, -1, np.full(df.Close.count(), np.nan)))
df.Flag.fillna(method='ffill', inplace=True)
912 µs ± 49.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

This is definitely better that loops or nested if conditions.

For example @Tim Mironov answer:

%%timeit
pos_indexes = (df.Close == df.HC)
neg_indexes = (df.Close == df.LC)

df.loc[pos_indexes, 'Good_Flag'] = 1
df.loc[neg_indexes, 'Good_Flag'] = -1

df.fillna(method='ffill', inplace=True)
4.43 ms ± 92 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
iDrwish
  • 3,085
  • 1
  • 15
  • 24
  • 1
    This is the best answer, and the way to do it in Pandas, although I think you could just do `df['Flag'] = np.where(df['Close'] == df['HC'], 1, np.where(df['Close'] == df['LC'], -1, np.nan))`, instead of using `np.full` – m13op22 Jul 25 '19 at 13:38
  • Just what I needed. Thank you! – Martingale Jul 25 '19 at 13:56
  • 1
    Nice answer, thank you for providing an additional, better performing method! – Tim Mironov Jul 25 '19 at 14:00
2

You can use more built-in functionality of Pandas, specifically fillna method and usage of logical indexing. I've added a portion of code to your original one to create an additional Good_Flag:

import pandas_datareader as dr
from datetime import date

df = dr.data.get_data_yahoo('SPY',start='01-01-2019',end=date.today())

df['HC'] = df['Close'].rolling(20).max() 
df['LC'] = df['Close'].rolling(20).min() 

df['Flag'] = [1 if (df.loc[ei, 'Close'] == df.loc[ei, 'HC']) else 
              -1 if (df.loc[ei, 'Close'] == df.loc[ei, 'LC']) else   
              0 for ei in df.index]

pos_indexes = df.Close == df.HC
neg_indexes = df.Close == df.LC
df.loc[pos_indexes, 'Good_Flag'] = 1
df.loc[neg_indexes, 'Good_Flag'] = -1
df = df.fillna(method='ffill')

Pay attention that I used the fillna method with ffill attribute to specify a "Forward Pass".

Edit: To make thigs clear, the original Flag column is left here on purpose and the calculation of the new Good_Flag column doesn't rely on Flag.

To demonstrate the desirable behaviour:

plt.plot(df.Flag, color='blue')
plt.plot(df.Good_Flag, color='red')
plt.legend(['Flag', 'Good_Flag'])
plt.show()

enter image description here

Tim Mironov
  • 849
  • 8
  • 22
  • When I run this code and plot it, I get the same results as OP's blue line in the example plot. I think you want to change the 0 in line 6 to `np.nan`, then it works as you'd expect. – m13op22 Jul 25 '19 at 13:39
  • 1
    @HS-nebula, Actually I just left the code provided by @Martingale as it was calculating the `Flag` column. I then calculate additional `Good_Flag` column as an alternative, to demonstrate the difference between the two. Be sure, that you plot both columns. – Tim Mironov Jul 25 '19 at 13:50
0

A simple way would be to use a loop, but that's inefficient in terms of time. However, if you don't mind, you can loop through the array

flag01 = 0
for ei in df.index:
    if (df.loc[ei, 'Close'] == df.loc[ei, 'HC']):
        flag01 = 1
    if (df.loc[ei, 'Close'] == df.loc[ei, 'LC'])
        flag01 = -1
    df.loc[ei, 'Flag'] = flag01

Basically you set up zero, whenever you find the condition true, you set to 1 and keep one until the condition is met to become -1, and so on. This is slower than the approach you used, but it's the simplest way to "keep the last value" as you are increasing the value and you know which value you had before.

freerafiki
  • 539
  • 3
  • 10
  • Everytime this loop runs, you're going to change the whole 'Flag' column to whatever value `flag01` is, not the value at each row. – m13op22 Jul 25 '19 at 13:30
  • 1
    you are right, it should be `df.loc[ei, 'Flag']` ? Thanks! Anyway, the answer below is much better, it solves the problem in a more elegant and efficient way. – freerafiki Jul 25 '19 at 13:35