1

I have a dataframe of minute data for multiple Stocks, each stock has multiple sessions. See sample below

         Symbol            Time     Open    High    Low  Close  Volume  LOD
2724312   AEHR 2019-09-23 09:31:00   1.42   1.42   1.42   1.42     200  NaN
2724313   AEHR 2019-09-23 09:43:00   1.35   1.35   1.34   1.34    6062  NaN
2724314   AEHR 2019-09-23 09:58:00   1.35   1.35   1.29   1.30    8665  NaN
2724315   AEHR 2019-09-23 09:59:00   1.32   1.32   1.32   1.32     100  NaN
2724316   AEHR 2019-09-23 10:00:00   1.35   1.35   1.35   1.35     400  NaN
...        ...                 ...    ...    ...    ...    ...     ...  ...
4266341     ZI 2021-09-10 15:56:00  63.08  63.16  63.08  63.15   18205  NaN
4266342     ZI 2021-09-10 15:57:00  63.14  63.14  63.07  63.07   19355  NaN
4266343     ZI 2021-09-10 15:58:00  63.07  63.12  63.07  63.10   16650  NaN
4266344     ZI 2021-09-10 15:59:00  63.09  63.12  63.06  63.11   25775  NaN
4266345     ZI 2021-09-10 16:00:00  63.11  63.17  63.11  63.17   28578  NaN

I need the Low Of Day(LOD) for the session (9:30-4pm) up to the time in each row.

The completed df should look like this

         Symbol            Time     Open    High    Low  Close  Volume  LOD
2724312   AEHR 2019-09-23 09:31:00   1.42   1.42   1.42   1.42     200  1.42   
2724313   AEHR 2019-09-23 09:43:00   1.35   1.35   1.34   1.34    6062  1.34   
2724314   AEHR 2019-09-23 09:58:00   1.35   1.35   1.29   1.30    8665  1.29   
2724315   AEHR 2019-09-23 09:59:00   1.32   1.32   1.32   1.32     100  1.29   
2724316   AEHR 2019-09-23 10:00:00   1.35   1.35   1.35   1.35     400  1.29   
...        ...                 ...    ...    ...    ...    ...     ...  ...
4266341     ZI 2021-09-10 15:56:00  63.08  63.16  63.08  63.15   18205  63.08  
4266342     ZI 2021-09-10 15:57:00  63.14  63.14  63.07  63.07   19355  63.07  
4266343     ZI 2021-09-10 15:58:00  63.07  63.12  63.07  63.10   16650  63.07  
4266344     ZI 2021-09-10 15:59:00  63.09  63.12  63.06  63.11   25775  63.06  
4266345     ZI 2021-09-10 16:00:00  63.11  63.17  63.11  63.17   28578  63.06 

My current solution

prev_symbol = "WXYZ"
prev_low = 10000000
prev_session = datetime.date(1920, 1, 1)
session_start = 1

for i, row in df.iterrows():
    current_session = (df['Time'].iloc[i]).time()
    current_symbol = df['Symbol'].iloc[i]
    if current_symbol == prev_symbol:
        if current_session == prev_session:
            sesh_low = df.iloc[session_start:i, 'Low'].min()
            df.at[i, 'LOD'] = sesh_low
        else:
            df.at[i, 'LOD'] = df.at[i, 'Low']
            prev_session = current_session
            session_start = i
    else:
        df.at[i, 'LOD'] = df.at[i, 'Low']
        prev_symbol = current_symbol
        prev_session = current_session
        session_start = i

This returns a SettingWithCopyWarning error. Please help

1 Answers1

0

You can try .groupby() + .expanding():

# if you have values already converted/sorted, skip:
# df["Time"] = pd.to_datetime(df["Time"])
# df = df.sort_values(by=["Symbol", "Time"])

df["LOD"] = df.groupby("Symbol")["Low"].expanding().min().values
print(df)

Prints:

        Symbol                 Time   Open   High    Low  Close  Volume    LOD
2724312   AEHR  2019-09-23 09:31:00   1.42   1.42   1.42   1.42     200   1.42
2724313   AEHR  2019-09-23 09:43:00   1.35   1.35   1.34   1.34    6062   1.34
2724314   AEHR  2019-09-23 09:58:00   1.35   1.35   1.29   1.30    8665   1.29
2724315   AEHR  2019-09-23 09:59:00   1.32   1.32   1.32   1.32     100   1.29
2724316   AEHR  2019-09-23 10:00:00   1.35   1.35   1.35   1.35     400   1.29
4266341     ZI  2021-09-10 15:56:00  63.08  63.16  63.08  63.15   18205  63.08
4266342     ZI  2021-09-10 15:57:00  63.14  63.14  63.07  63.07   19355  63.07
4266343     ZI  2021-09-10 15:58:00  63.07  63.12  63.07  63.10   16650  63.07
4266344     ZI  2021-09-10 15:59:00  63.09  63.12  63.06  63.11   25775  63.06
4266345     ZI  2021-09-10 16:00:00  63.11  63.17  63.11  63.17   28578  63.06
Andrej Kesely
  • 168,389
  • 15
  • 48
  • 91
  • apologies, how would you account for different sessions within the same ticker. This solution appears to find the low spanning all sessions for the Symbol. One session spans from 9:30 - 4:00 each day – Trippy Dippy Sep 11 '21 at 21:57
  • @AnthonyShi You can create "temporary" column identifying various sessions and the groupby by "Symbol" and this column. Something in the lines of `df.groupby(["Symbol", "temporary column name"])["Low"].expanding().min()` – Andrej Kesely Sep 11 '21 at 21:59