I am trying to extract the daily minimum from a dataset of hourly prices. I want to do this for each hour separately, because I later want to add other information to each hour before combining the datasets again (which is why I want to keep the hour in the datetime index).

This is my data:

                     Price_REG1  Price_REG2  ...  Price_24_3  Price_24_4
date                                         ...                        
2020-01-01 00:00:00       30.83       30.83  ...         NaN         NaN
2020-01-01 01:00:00       28.78       28.78  ...         NaN         NaN
2020-01-01 02:00:00       28.45       28.45  ...       30.83       30.83
2020-01-01 03:00:00       27.90       27.90  ...       28.78       28.78
2020-01-01 04:00:00       27.52       27.52  ...       28.45       28.45

To extract the minimum I use this command:

df_min_1 = df_hour_1[['Price_REG1', 'Price_REG2', 'Price_REG3', 
                 'Price_REG4']].between_time('00:00', '23:00').resample('d').min()

Which leaves me with this:

            Price_REG1  Price_REG2  Price_REG3  Price_REG4
date                                                      
2020-01-01       25.07       25.07       25.07       25.07
2020-01-02       12.07       12.07       12.07       12.07
2020-01-03        0.14        0.14        0.14        0.14
2020-01-04        3.83        3.83        3.83        3.83
2020-01-05       25.77       25.77       25.77       25.77

I understand that resample is what reduces the index to the date only, but I want to know whether there is a way to avoid this, or another way to get the result I am after.

To clarify, this is what I would like to have:

                     Price_REG1  Price_REG2  Price_REG3  Price_REG4
date                                                      
2020-01-01 01:00:00       25.07       25.07       25.07       25.07
2020-01-02 01:00:00       12.07       12.07       12.07       12.07
2020-01-03 01:00:00        0.14        0.14        0.14        0.14
2020-01-04 01:00:00        3.83        3.83        3.83        3.83
2020-01-05 01:00:00       25.77       25.77       25.77       25.77

1 Answer

I did not find a nice solution to this problem, but I managed to get where I wanted with this method:

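import datetime

import pandas as pd

# One-hour offset to move the daily labels from 00:00 to 01:00
t = datetime.timedelta(hours=1)

# Turn the 'date' index into a column so it can be shifted
df_min = df_min.reset_index()
df_min['date'] = df_min['date'] + t
df_min = df_min.set_index('date')

# Join the shifted daily minima onto the hourly frame, aligned on the index
df_hour_1 = pd.concat([df_hour_1, df_min], axis=1)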

That is, I first create a timedelta of 01:00:00, then reset the index so that I can add the timedelta to the date column. This way I can concatenate df_hour_1 and df_min while still keeping the time, so I can concat all 24 datasets in a later step.
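The same shift can probably be done without the reset_index/set_index round trip, since a DatetimeIndex supports adding a timedelta directly. The following is only a sketch on made-up data: the frame `df_hour`, its single `Price_REG1` column, and the values are assumptions for illustration, not the dataset from the question.

```python
import pandas as pd

# Illustrative hourly data (NOT the asker's dataset): 3 days, one price column.
idx = pd.date_range("2020-01-01", periods=72, freq=pd.Timedelta(hours=1))
df_hour = pd.DataFrame(
    {"Price_REG1": [float((7 * i) % 50) for i in range(72)]},
    index=idx,
)
df_hour.index.name = "date"

# Daily minimum; resample labels each day at midnight.
df_min = df_hour.resample("D").min()

# Shift the index labels by one hour in place of reset_index/set_index.
df_min.index = df_min.index + pd.Timedelta(hours=1)

print(df_min)
```

For the other hourly datasets, the `hours=1` value would presumably be changed to the hour in question.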