Pandas detect overday date

Question

I have a data which looks like this:

Time	Data
13:45:00	Data 1
13:45:03	Data 2
13:45:14	Data 3
13:45:22	Data 4
13:45:24	Data 5
00:00:03	Data 6
00:00:26	Data 7

The data have over day data but it only contains time but no date, the date I have to manually assign a new column to the dataframe and combined it with time using

df['DateTime'] = df['Date'].astype('str') + " "  + df['Time'].astype('str')

but this method can only assign one day, I want the program can auto add 1 day to the date if over 12am. Besides, you can see from 13:45:00 to 13:45:03, there is a 3 seconds time gap between them, I want to fill every time gap up and able to show data for every seconds

Desired Output:

DateTime	Data
2021/4/19 13:45:00	Data 1
2021/4/19 13:45:01	Data 1
2021/4/19 13:45:02	Data 1
2021/4/19 13:45:03	Data 2
2021/4/19 13:45:04	Data 2
...	...
2021/4/20 00:00:01	Data 5
2021/4/20 00:00:02	Data 5
2021/4/20 00:00:03	Data 6
2021/4/20 00:00:04	Data 6

I have tried resample function to do the job but the resample will always do the sorting before it resampling the data which result in row of 00:00:03 will be in the first row (00:00:03 should be for next day but the function mistakenly compile them into one day). Is there a way to let pandas recognize that this dataset contains over day data and automatically plus one day to the date if exceed 12am?

score 2 · Answer 1 · answered Apr 19 '21 at 03:36

Try this:

idx = pd.to_datetime("2021/04/19 " + df["Time"]) + pd.to_timedelta(
    pd.to_datetime(df["Time"]).diff().lt(pd.Timedelta(seconds=0)).cumsum(), unit="days"
)

dfi = df.set_index(idx)
dfi.resample("S").ffill()

Output:

                         Time    Data
Time                                 
2021-04-19 13:45:00  13:45:00  Data 1
2021-04-19 13:45:01  13:45:00  Data 1
2021-04-19 13:45:02  13:45:00  Data 1
2021-04-19 13:45:03  13:45:03  Data 2
2021-04-19 13:45:04  13:45:03  Data 2
...                       ...     ...
2021-04-20 00:00:22  00:00:03  Data 6
2021-04-20 00:00:23  00:00:03  Data 6
2021-04-20 00:00:24  00:00:03  Data 6
2021-04-20 00:00:25  00:00:03  Data 6
2021-04-20 00:00:26  00:00:26  Data 7

[36927 rows x 2 columns]

This is exactly what I am looking for! Thank you! – Hanyi Koh Apr 19 '21 at 04:42 — Hanyi Koh, Apr 19 '21 at 04:42

Pandas detect overday date

1 Answers1