5

I have the following dataframe :

**flashtalking_df =**

+--------------+--------------------------+------------------------+
| Placement ID | Average Interaction Time | Total Interaction Time |
+--------------+--------------------------+------------------------+
|      2041083 | 00:01:04.12182           | 24:29:27.500           |
|      2041083 | 00:00:54.75043           | 52:31:48.89108         |
+--------------+--------------------------+------------------------+

where 00:01:04.12182 = HH:MM:SS.F

I need to convert both columns, Average Interaction Time, and Total Interaction Time into seconds.

The problem is that Total Interaction Time goes over 24h.

I found the following code which works for the most part. However, when the Total Interaction Time goes over 24h, it gives me

ValueError: time data '24:29:27.500' does not match format '%H:%M:%S.%f'

This is the function I am currently using, which I grabbed from another Stack Overflow question, for both Average Interaction Time and Total Interaction Time:

flashtalking_df['time'] = flashtalking_df['Total Interaction Time'].apply(lambda x: datetime.datetime.strptime(x,'%H:%M:%S.%f'))
flashtalking_df['timedelta'] = flashtalking_df['time'] - datetime.datetime.strptime('00:00:00.00000','%H:%M:%S.%f')
flashtalking_df['Total Interaction Time'] = flashtalking_df['timedelta'].apply(lambda x: x / np.timedelta64(1, 's'))

If there's an easier way, please let me know.

Thank you for all your help

Matteo M
  • 137
  • 2
  • 8

1 Answers1

8

I think you need first convert to_timedelta and then to seconds by astype:

df['Average Interaction Time'] = pd.to_timedelta(df['Average Interaction Time'])
                                   .astype('timedelta64[s]')
                                   .astype(int)

df['Total Interaction Time'] =   pd.to_timedelta(df['Total Interaction Time'])
                                   .astype('timedelta64[s]')
                                   .astype(int)
                                   .map('{:,.2f}'.format)
print (df)
   Placement ID  Average Interaction Time Total Interaction Time
0       2041083                        64              88,167.00
1       2041083                        54             189,108.00

Solution with total_seconds, thank you NickilMaveli:

df['Average Interaction Time'] = pd.to_timedelta(df['Average Interaction Time'])
                                   .dt.total_seconds()
                                   .map('{:,.2f}'.format)
df['Total Interaction Time'] =   pd.to_timedelta(df['Total Interaction Time'])
                                   .dt.total_seconds()
                                   .map('{:,.2f}'.format)

print (df)   
   Placement ID Average Interaction Time Total Interaction Time
0       2041083                    64.12              88,167.50
1       2041083                    54.75             189,108.89
Community
  • 1
  • 1
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • Hi Jezrael! Thanks for the help. Is there any way to add a decimal breakdown as well (eg. 88167 becomes 88,167.50) – Matteo M Nov 08 '16 at 11:26
  • Thank you so much jezreal! On a sidenote. Would you reccomend any course or resource that I can take or reference when I encounter this type of stops so that I can resolve them myself? – Matteo M Nov 08 '16 at 11:46
  • Hi jezrael, one last thing. The conversion doesn't seem to work even with the .map('{:,.2f}'.format If we take 00:01:04.12182 the result should be 64.12, but the output is still 64 – Matteo M Nov 08 '16 at 11:56
  • Hard question - maybe `pandas` docs is very [nice](http://pandas.pydata.org/pandas-docs/stable/timedeltas.html). And course unfortunately I have no idea. – jezrael Nov 08 '16 at 11:56
  • Do you use python `2` or `python 3` ? – jezrael Nov 08 '16 at 11:57
  • Hey, I use Python 3 – Matteo M Nov 08 '16 at 12:01
  • Interesting, for me it works nice with `pandas 0.19.1`, check you version by `print (pd.show_info())` – jezrael Nov 08 '16 at 12:02
  • 2
    @jezrael, Looks like the OP wants to obtain the entire precision of decimal digits going by the responses - Something like `pd.to_timedelta(df['Total Interaction Time']).apply(lambda td: td.total_seconds())` would suffice. – Nickil Maveli Nov 08 '16 at 12:05
  • 1
    @NickilMaveli - thank you, i add it to answer (little modified) – jezrael Nov 08 '16 at 12:09