How can i get the average delivery time (stored as timedelta64[ns]) per day on Pandas?

Question

So, i have the following data frame on pandas:

I need to create another data frame with the average delivery time by day, by i'm having a hardtime with this task.

So, as you can see, I have all the 365 days of 2018 and I need to calculate what was the average time the user had to wait before getting his order delivered. I only need the hours, minutes and seconds, because the number of days will never be grater than 0. And the delivery time is timedelta64 format.

Thank you folks!

I think you want `df.groupby(df.date.dt.date).delivery_time.mean()`. — Trenton McKinney, Jan 24 '21 at 17:47

k33da_the_bug · Accepted Answer · 2021-01-24T15:31:28.567

User groupby and mean

import pandas as pd
from io import StringIO

#Data preprocessing(ignore)
data = StringIO('''
2018-01-01 ,0 days 00:58:26 
2018-01-01 ,0 days 01:27:04 
2018-01-01 ,0 days 00:17:27 
2018-01-01 ,0 days 00:14:26 
2018-01-01 ,0 days 01:08:33 
''')

#Converting to datetime and timedelta object
df = pd.read_csv(data,names=['date','delivery_time'],parse_dates=['date'])
df['delivery_time'] = pd.to_timedelta(df['delivery_time'])

#Grouping by date and then finding mean of delivery time
df.groupby(['date']).mean(numeric_only=False)

Output:

            delivery_time
date    
2018-01-01  0 days 00:49:11.200000

using dt.components you can easily extract hours minutes and seconds from your timedelta column. Something like following should generate your result.

df.groupby(['date']).mean(numeric_only=False)['delivery_time'].dt.components.iloc[:, 1:4]

sophocles · Answer 2 · 2021-01-24T15:33:42.233

This will get you the average delivery time (in timedelta[ns]) per day in a new dataframe:

Convert the total delivery time to total seconds with numpy's timedelta64[ns]

import pandas as pd
import numpy as np

df['del_time_sec'] = df[:8]['delivery_time'] / np.timedelta64(1, 's')

Get the average time, using GroupBy.transform, per day using pandas.Series.dt.day

df['avg_sec'] = (df.groupby(df['date'].dt.day)['del_time_sec'].transform('mean'))

Convert back to datetime, and get your answer in a new dataframe:

df['AVG_del_time_perday'] = pd.to_timedelta(df['avg_sec'], unit='s')
res = df[['date','AVG_del_time_perday']].drop_duplicates()

res
Out[116]: 
        date       AVG_del_time_perday
0 2018-01-01    0 days 00:49:11.200000
5 2018-12-31 0 days 00:24:16.666666667

How can i get the average delivery time (stored as timedelta64[ns]) per day on Pandas?

2 Answers2