I have a big DataFrame with many categories, and each category has several durations like the ones below. I need to sum them all. In Excel this is easy, but in Python every approach I have tried raises an error.

[In]

dfMDA = df.loc[df['Category'].str.contains("MDA")]

dfMDA["Duration"] = pd.to_datetime(dfMDA["Duration"], format='%H:%M:%S')

dfMDA['Duration']

[Out]

2     1900-01-01 01:27:40
40    1900-01-01 00:03:45
52    1900-01-01 00:01:54
Name: Duration, dtype: datetime64[ns]

.sum() isn't working. I need the total of all these durations (the result should be 01:33:19).

ALollz
Eric Stralsund
  • Use `to_timedelta` instead of `to_datetime`: `dfMDA["Duration"] = pd.to_timedelta(dfMDA["Duration"])`. Then you can sum. – Quang Hoang Nov 20 '19 at 20:11
  • In addition to @roganjosh, it might also help to be able to strip the date from the timestamps and keep only the time, which is described here: https://stackoverflow.com/questions/24567078/how-to-remove-the-date-information-in-a-column-just-keep-time – Tim.Lucas Nov 20 '19 at 20:16
  • Quang Hoang, thank you so much... perfect! – Eric Stralsund Nov 20 '19 at 22:45
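
The `to_timedelta` suggestion from the comments can be sketched roughly as follows. The sample `df` and its category names are made up for illustration (only the three durations come from the question), and the `.copy()` call and the HH:MM:SS formatting at the end are additions that the comments do not mention.

```python
import pandas as pd

# Hypothetical sample data; only the three MDA durations come from the question.
df = pd.DataFrame({
    "Category": ["MDA-1", "Other", "MDA-2", "MDA-3"],
    "Duration": ["01:27:40", "00:10:00", "00:03:45", "00:01:54"],
})

# .copy() makes dfMDA an independent frame, so the assignment below
# does not trigger a SettingWithCopyWarning.
dfMDA = df.loc[df["Category"].str.contains("MDA")].copy()

# Parse the "HH:MM:SS" strings as durations (timedelta64), not as points in time.
dfMDA["Duration"] = pd.to_timedelta(dfMDA["Duration"])

# Timedeltas can be summed; the result is a single pd.Timedelta.
total = dfMDA["Duration"].sum()
print(total)  # 0 days 01:33:19

# Optional: format the total as HH:MM:SS (hours keep counting past 24).
seconds = int(total.total_seconds())
hours, remainder = divmod(seconds, 3600)
minutes, secs = divmod(remainder, 60)
formatted = f"{hours:02d}:{minutes:02d}:{secs:02d}"
print(formatted)  # 01:33:19
```

The underlying difference is that `to_datetime` with `format='%H:%M:%S'` produces timestamps anchored on 1900-01-01, and points in time cannot be added together, whereas `to_timedelta` produces durations, which can.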
