I have a dataframe with the following information:
   Departure Time  Offset Dep  Arrival Time  Offset Arr
0           07:10      +01:00         08:25      +01:00
1           09:05      +01:00         10:10      +01:00
2           10:50      +01:00         12:05      +01:00
3           11:55      +01:00         14:15      +00:00
4           14:55      +02:00         18:40      +01:00
df.dtypes
Departure Time      object
Offset Departure    object
Arrival Time        object
Offset Arrival      object
dtype: object
I would like to calculate the time duration: Arrival Time + Offset Arr - Departure Time - Offset Dep
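To make the formula concrete, here is a minimal sketch of the calculation for row 0 only, worked by hand with Python's standard library. The variable names are illustrative and not part of the dataframe, and the offsets are typed in manually rather than parsed from the strings:

```python
from datetime import datetime, timedelta

# Row 0 of the table, entered by hand (names are illustrative only).
departure = datetime.strptime("07:10", "%H:%M")
arrival = datetime.strptime("08:25", "%H:%M")
offset_dep = timedelta(hours=1)  # "+01:00"
offset_arr = timedelta(hours=1)  # "+01:00"

# Arrival Time + Offset Arr - Departure Time - Offset Dep
duration = (arrival + offset_arr) - (departure + offset_dep)

# Express the result as a plain number of minutes.
duration_minutes = duration.total_seconds() / 60
print(duration_minutes)  # 75.0
```

With equal offsets they cancel out, so the result is simply 08:25 minus 07:10, i.e. 75 minutes.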
I first tried to convert all of them to the time format but I could only do this with the actual times, not the time offsets:
df["Arrival Time"] = pd.to_datetime(df["Arrival Time"]).dt.time
df["Departure Time"] = pd.to_datetime(df["Departure Time"]).dt.time
So my problem is twofold: first, how to convert the offset columns to a format I can use for time calculations, and second, how to calculate the time duration efficiently.
As I want to use the time duration for a data science calculation (Gradient Boosting), it would be great if you could suggest a duration format that can be plugged into the algorithm right away.
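One possible approach, sketched under a few assumptions: all four columns are still the original strings (so the `.dt.time` conversion has not been applied), the column names are the ones reported by `df.dtypes`, and the sign convention is exactly the formula stated in the question. The helper `hhmm_to_timedelta` is a hypothetical name introduced here for illustration. The result is a float column of minutes, which is a numeric format that tree-based models such as gradient boosting generally accept directly.

```python
import pandas as pd

# Sample data copied from the table in the question;
# column names follow the df.dtypes output.
df = pd.DataFrame({
    "Departure Time": ["07:10", "09:05", "10:50", "11:55", "14:55"],
    "Offset Departure": ["+01:00", "+01:00", "+01:00", "+01:00", "+02:00"],
    "Arrival Time": ["08:25", "10:10", "12:05", "14:15", "18:40"],
    "Offset Arrival": ["+01:00", "+01:00", "+01:00", "+00:00", "+01:00"],
})


def hhmm_to_timedelta(col):
    """Convert strings such as '07:10', '+01:00' or '-02:30' to Timedelta.

    Hypothetical helper: a leading '-' makes the value negative,
    a leading '+' (or no sign) makes it positive.
    """
    sign = col.str.startswith("-").map({True: -1, False: 1})
    parts = col.str.lstrip("+-").str.split(":", expand=True).astype(int)
    minutes = parts[0] * 60 + parts[1]
    return sign * pd.to_timedelta(minutes, unit="min")


# Formula from the question:
# Arrival Time + Offset Arr - Departure Time - Offset Dep
duration = (
    hhmm_to_timedelta(df["Arrival Time"])
    + hhmm_to_timedelta(df["Offset Arrival"])
    - hhmm_to_timedelta(df["Departure Time"])
    - hhmm_to_timedelta(df["Offset Departure"])
)

# Plain float minutes, usable as a numeric model feature.
df["Duration (min)"] = duration.dt.total_seconds() / 60
print(df["Duration (min)"].tolist())  # [75.0, 65.0, 75.0, 80.0, 165.0]
```

Two caveats about this sketch: it does not handle arrivals after midnight (the duration would come out negative), and if the offsets are meant as UTC offsets of local times, the signs on the two offset terms would need to be swapped.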