I'm trying to downsample a DataFrame of minute-by-minute data into 5-minute bins. Here is my current code:
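import pandas as pd
# Load the minute-level closing prices, using the parsed 'date' column as the index
df = pd.read_csv('stockPrices/closingPrices-apr3.csv',index_col='date',parse_dates=True)
df['close'] = df['close'].shift()
# Downsample to 5-minute bins, keeping the last value in each bin
df5min = df.resample('5T').last()
print(df5min.tail())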
The link to the csv file is here: https://drive.google.com/file/d/1uvkUaJwrQNsmte5IQIsJ_g5GS8RjVd8B/view?usp=sharing
I expect the output to stop at 2019-04-03 14:40:00, because the last timestamp in the file is 14:48:00, so a full 5-minute bin covering 14:45-14:49 is not possible. Instead, the result contains the following datetime index values, which do not exist in my CSV file:
2019-04-03 14:45:00 286.35
2019-04-03 14:50:00 286.52
2019-04-03 14:55:00 286.32
2019-04-03 15:00:00 286.45
2019-04-03 15:05:00 280.64
The only workaround I have found so far is the following, but it also cuts off the data from all previous days at 14:40:
df5min = df.resample('5T').last().between_time(start_time='9:30',end_time='14:40')
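To make the desired behavior concrete, here is a minimal, self-contained sketch of what I am after. It uses synthetic data rather than my real CSV (the 09:30-09:48 range and the integer `close` values are made up for illustration), and keeping only bins that hold all five one-minute rows is just one possible approach, not necessarily the right one:

```python
import pandas as pd

# Synthetic one-minute data (hypothetical, not the real CSV): 09:30 through 09:48.
idx = pd.date_range("2019-04-03 09:30", "2019-04-03 09:48", freq="min")
df = pd.DataFrame({"close": list(range(len(idx)))}, index=idx)

bins = df["close"].resample("5min")

# A bin is "complete" only if it contains all five one-minute rows.
# size() counts rows, including any NaN values.
complete = bins.size() == 5

# Keep the last value of each complete bin; the partial 09:45 bin is dropped.
df5min = bins.last()[complete]
print(df5min.tail())
```

In this sketch the 09:45 bin has only four rows (09:45 to 09:48), so the result ends at 09:40, which is the kind of cutoff I want for the last day without truncating the earlier days.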
Any help on this is appreciated.