
I have a dataset (a pandas DataFrame) in which I computed the difference between two dates, then changed the format of the result to HH:MM:SS.

df['start'] type: datetime64[ns]
df['end'] type: datetime64[ns]

df['time'] = df['end'] - df['start']

df['time'] type: timedelta64[ns]

Then

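import datetime as dt

def formatehours(interval):
    # Total length of the interval in seconds
    seconds = interval.total_seconds()
    # Minutes and seconds part, formatted as MM:SS
    duration = dt.datetime.utcfromtimestamp(seconds)
    formatted = duration.strftime('%M:%S')
    # Prepend the whole number of hours
    return '%02d:%s' % (seconds // (60 * 60), formatted)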

So:

df['time'] = df['time'].apply(formatehours)

df['time'] type: object

df['time']:

02:19:36
08:56:46
01:59:51

I need to calculate the median of these durations, but I can't, because after formatting the column holds strings (dtype object) instead of timedeltas.
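
One possible way around this, as a minimal sketch: take the median while the column is still timedelta64[ns], and format only for display. The sample start/end values and the names format_hours, time_str, and median_str are made up for illustration and are not from my real data.

```python
import pandas as pd

# Hypothetical sample data; the durations match the three values shown above
df = pd.DataFrame({
    "start": pd.to_datetime(["2022-09-27 08:00:00",
                             "2022-09-27 09:00:00",
                             "2022-09-27 10:00:00"]),
    "end": pd.to_datetime(["2022-09-27 10:19:36",
                           "2022-09-27 17:56:46",
                           "2022-09-27 11:59:51"]),
})

# Keep this column as timedelta64[ns] so numeric operations still work
df["time"] = df["end"] - df["start"]

# Median computed on the timedelta column, before any string formatting
median_time = df["time"].median()


def format_hours(interval):
    """Format a Timedelta as HH:MM:SS (hours are not wrapped at 24)."""
    total = int(interval.total_seconds())
    hours, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


# Separate display column; the original timedelta column is left untouched
df["time_str"] = df["time"].apply(format_hours)
median_str = format_hours(median_time)
```

If only the formatted strings are left, pd.to_timedelta(df['time_str']) should parse HH:MM:SS values back into timedeltas.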

Luan Brito
  • What does your database look like? You can probably subtract adjacent entries to get each delta. If you are using something like created_at = Column(DateTime(timezone=True), default=datetime.datetime.now()), then you can get the delta in seconds like this: timedata = int(datetime_delta.total_seconds()) – mhanuel Sep 27 '22 at 15:48
  • df['start'] type: datetime64[ns] df['end'] type: datetime64[ns] df['time'] = df['end'] - df['start'] df['time'] type: timedelta64[ns] Then def formatehours(interval): seconds = interval.total_seconds() duration = dt.datetime.utcfromtimestamp(seconds) formatted = duration.strftime('%M:%S') return '%02d:%s' % (seconds / (60 * 60), formatted) So: df['time'] = df['time'].apply(formatehours) df['time'] type: object df['time']: 02:19:36 08:56:46 01:59:51 Now I want to do a groupby and then .median() – Luan Brito Sep 27 '22 at 17:06
  • Please **format** your code blocks appropriately, and do **not** post code in the comments! – desertnaut Oct 02 '22 at 23:27
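
For the groupby-then-median step mentioned in the comments, a rough sketch could look like the following. It assumes the column already holds HH:MM:SS strings; the grouping column name "group" and the sample rows are hypothetical. Going through plain seconds is a cautious choice here, since float medians behave the same across pandas versions.

```python
import pandas as pd

# Hypothetical data: "group" is an assumed grouping column,
# "time" holds durations already formatted as HH:MM:SS strings
df = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b"],
    "time": ["02:19:36", "08:56:46", "01:59:51", "00:10:00", "00:30:00"],
})

# Parse the strings back into timedeltas, then into plain seconds
df["time_td"] = pd.to_timedelta(df["time"])
df["seconds"] = df["time_td"].dt.total_seconds()

# Median per group on the numeric seconds column
median_seconds = df.groupby("group")["seconds"].median()

# Convert the per-group medians back to timedeltas for readability
median_td = pd.to_timedelta(median_seconds, unit="s")
```

Here median_td is a Series indexed by group, so median_td["a"] gives the median duration of group "a" as a Timedelta.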

0 Answers