I have a data frame df
:
df =
index date hats
A1 01-01-2020 5
A1 02-01-2020 10
A1 03-01-2020 16
A1 04-01-2020 16
A1 21-01-2020 9
A1 22-01-2020 8
A1 23-01-2020 7
A6 20-03-2020 5
A6 21-03-2020 5
A8 30-07-2020 12
Here, the first four rows are consecutive days. I want to know the start date and end date of all such consecutive days in the data frame. If there is only one day in a series like wise A8
index in the df
then the start date and and end will be same. Moreover, I am also interested in knowing the highest value in df['hats']
column in the series of consecutive days and return its date in a seperate column high_hat
along with its date high_hat_date
. If there are two or more equal high values in a series of consecutive days then write the number of occurrence of high value in a new column num_hat
, and write the first occurrence date in high_hat_date
.
The example output for the above data frame is as follows:
index start_date end_date high_hat high_hat_date num_hat
A1 01-01-2020 04-01-2020 16 03-01-2020 2
A1 21-01-2020 23-01-2020 9 21-01-2020 1
A6 20-03-2020 21-03-2020 5 20-03-2020 2
A8 30-07-2020 30-07-2020 12 30-07-2020 1
Any help in this regard is highly appreciated.