3

I have a data which looks like this:

Time Data
13:45:00 Data 1
13:45:03 Data 2
13:45:14 Data 3
13:45:22 Data 4
13:45:24 Data 5
00:00:03 Data 6
00:00:26 Data 7

The data have over day data but it only contains time but no date, the date I have to manually assign a new column to the dataframe and combined it with time using

df['DateTime'] = df['Date'].astype('str') + " "  + df['Time'].astype('str')

but this method can only assign one day, I want the program can auto add 1 day to the date if over 12am. Besides, you can see from 13:45:00 to 13:45:03, there is a 3 seconds time gap between them, I want to fill every time gap up and able to show data for every seconds

Desired Output:

DateTime Data
2021/4/19 13:45:00 Data 1
2021/4/19 13:45:01 Data 1
2021/4/19 13:45:02 Data 1
2021/4/19 13:45:03 Data 2
2021/4/19 13:45:04 Data 2
... ...
2021/4/20 00:00:01 Data 5
2021/4/20 00:00:02 Data 5
2021/4/20 00:00:03 Data 6
2021/4/20 00:00:04 Data 6

I have tried resample function to do the job but the resample will always do the sorting before it resampling the data which result in row of 00:00:03 will be in the first row (00:00:03 should be for next day but the function mistakenly compile them into one day). Is there a way to let pandas recognize that this dataset contains over day data and automatically plus one day to the date if exceed 12am?

Hanyi Koh
  • 327
  • 1
  • 4
  • 15

1 Answers1

2

Try this:

idx = pd.to_datetime("2021/04/19 " + df["Time"]) + pd.to_timedelta(
    pd.to_datetime(df["Time"]).diff().lt(pd.Timedelta(seconds=0)).cumsum(), unit="days"
)

dfi = df.set_index(idx)
dfi.resample("S").ffill()

Output:

                         Time    Data
Time                                 
2021-04-19 13:45:00  13:45:00  Data 1
2021-04-19 13:45:01  13:45:00  Data 1
2021-04-19 13:45:02  13:45:00  Data 1
2021-04-19 13:45:03  13:45:03  Data 2
2021-04-19 13:45:04  13:45:03  Data 2
...                       ...     ...
2021-04-20 00:00:22  00:00:03  Data 6
2021-04-20 00:00:23  00:00:03  Data 6
2021-04-20 00:00:24  00:00:03  Data 6
2021-04-20 00:00:25  00:00:03  Data 6
2021-04-20 00:00:26  00:00:26  Data 7

[36927 rows x 2 columns]
Scott Boston
  • 147,308
  • 15
  • 139
  • 187