0

Here is a dataframe:

                timestamp    open    high     low   close   volume
0     2023-01-03 09:30:00  3.5000  3.5800  3.5000  3.5300   1595.0
1     2023-01-03 09:35:00  3.5800  3.5800  3.5800  3.5800    102.0
2     2023-01-03 09:40:00  3.5972  3.5972  3.5972  3.5972    103.0
3     2023-01-03 09:50:00  3.5500  3.5700  3.5000  3.5200   3032.0
4     2023-01-03 09:55:00  3.4100  3.5900  3.4100  3.5900   2205.0
...                   ...     ...     ...     ...     ...      ...
1245  2023-02-21 19:35:00  6.7100  6.7200  6.7100  6.7200    739.0
1246  2023-02-21 19:40:00  6.7200  6.7300  6.7200  6.7300   2243.0
1247  2023-02-21 19:45:00  6.7300  6.7300  6.7000  6.7000   1412.0
1248  2023-02-21 19:50:00  6.7100  6.7400  6.7000  6.7400   4082.0
1249  2023-02-21 19:55:00  6.7400  6.8500  6.6500  6.7200  23678.0

[1250 rows x 6 columns]

For each day, how can I extract the lowest price in the low column between 4:00 and 9:30 and put it in a list?

Here is what I tried:

pre_market_low_list = []
for date in olhc.timestamp.dt.date.unique():
    pre_market_low_list.append(pre_market_low(olhc[olhc.timestamp.dt.date == date].low))
Guru Stron
  • 102,774
  • 10
  • 95
  • 132
David
  • 13
  • 5
  • Does this help your question? https://stackoverflow.com/questions/46839032/grouping-by-date-range-with-pandas You can group by and then do min() – UpmostScarab Feb 22 '23 at 22:08

4 Answers4

0

You can leverage some control flow logic with comparison operators to accomplish this:

pre_market_low_list = []
for date in df.timestamp.dt.date.unique():
    # Get the rows for "date" between 4:00 and 9:30
    rows = df[(df.timestamp.dt.date == date) & (df.timestamp.dt.time >= datetime.time(4,0)) & (df.timestamp.dt.time <= datetime.time(9,30))]
    
    if len(rows) > 0:
        # Get the lowest price in the low column
        pre_market_low = rows.low.min()
        pre_market_low_list.append(pre_market_low)
    else:
        # If there are no rows for the current date and time range, append None
        pre_market_low_list.append(None)

The comparison operators (e.g. >=) are used for filtering. The control flow (if / then) piece just avoids the loop throwing an error if there's no relevant data.

Hack-R
  • 22,422
  • 14
  • 75
  • 131
  • It does not seem to have `dt` in `df.timestamp`. Is it normal? – David Feb 22 '23 at 22:24
  • @David Hi David. I see you've been mentioning that to all answers, but I'm only 80% sure of what you mean. Could you please elaborate? Specifically, can you please show me the output / error from where you ran my code? – Hack-R Feb 22 '23 at 23:36
  • @David Oh, I think you have a wrong data type, probably. Try this, please: `df['timestamp'] = pd.to_datetime(df['timestamp'])` (or otherwise please show me the output) – Hack-R Feb 22 '23 at 23:42
0

You can filter your dataframe using start and end time, select needed column and turn it into list:

import datetime

start = datetime.time(4, 0, 0)
end = datetime.time(9, 30, 0)
# df['timestamp'] = pd.to_datetime(df['timestamp']) # if column is string
result = df[df['timestamp'].dt.time.between(start, end)]['low'].to_list()
Guru Stron
  • 102,774
  • 10
  • 95
  • 132
  • It does not seem to have dt in df.timestamp. Is it normal? – David Feb 22 '23 at 22:48
  • @David have you tried it? Also uncomment the `df['timestamp'] = pd.to_datetime(df['timestamp'])` if you have a string in timestamp column. – Guru Stron Feb 22 '23 at 22:49
0

Separate the day and time into columns, then filter with a query and get the lowest price/day with a groupby

import datetime as dt

df['day'] = df.timestamp.dt.date
df['time'] = df.timestamp.dt.time
df.query('@dt.time(4)<=time <= @dt.time(9, 50)').groupby('day')['low'].min()
Michael Cao
  • 2,278
  • 1
  • 1
  • 13
0

This is an approach using groupby().min() then Series.tolist()

df['timestamp'] = pd.to_datetime(df['timestamp'])

df = df[(df['timestamp'].dt.time >= pd.Timestamp('04:00').time()) &
        (df['timestamp'].dt.time >= pd.Timestamp('09:30').time())]
        
lowest_lows = df.groupby(df['timestamp'].dt.date)['low'].min().tolist()

You can also use pandas.DataFrame.between_time()

df['timestamp'] = pd.to_datetime(df['timestamp'])
df.set_index('timestamp', inplace=True)

lowest_lows = df.between_time("04:00", "09:30")["low"].tolist()
Jamiu S.
  • 5,257
  • 5
  • 12
  • 34