
I'm having trouble formatting the time units in the output of the diff() function.

Here is the code:

import pandas as pd
from numpy import random

df = pd.DataFrame(data = random.randn(5,4), index = ['A','B','C','D','E'],
columns = ['W','X','Y','Z'])

df['W'] = ['10/01/2018 12:00:00','10/03/2018 13:00:00',
           '10/03/2018 12:30:00','10/04/2018 12:05:00',
           '10/08/2018 12:00:15']

df['W']=pd.to_datetime(df['W'])
df['delta']=df['W'].diff()
df

This is what I get ('delta' column):

    W           X           Y           Z           delta
A   2018-10-01  0.218683    1.704266    1.035627    NaT
B   2018-10-03  -1.362903   1.251404    -0.296558   2 days 01:00:00
C   2018-10-03  1.288930    -1.692359   1.185029    -1 days +23:30:00
D   2018-10-04  1.355021    1.144945    -1.294918   0 days 23:35:00
E   2018-10-08  -0.572535   0.236500    -0.435992   3 days 23:55:15

This is what I would like to get in the 'delta' column (differences expressed in days):

    W           X           Y           Z           delta
A   2018-10-01  0.218683    1.704266    1.035627    NaT
B   2018-10-03  -1.362903   1.251404    -0.296558   2.04
C   2018-10-03  1.288930    -1.692359   1.185029    -0.02
D   2018-10-04  1.355021    1.144945    -1.294918   0.98
E   2018-10-08  -0.572535   0.236500    -0.435992   3.99

Any ideas?

Thanks for the help!

user9185511

1 Answer


Convert the timedeltas to seconds with Series.dt.total_seconds, divide by 86400 (60 * 60 * 24 seconds per day) to get days, and finally round:

df['delta'] = df['W'].diff().dt.total_seconds().div(86400).round(2)
print(df)
                    W         X         Y         Z  delta
A 2018-10-01 12:00:00  0.821455  1.481278  1.331864    NaN
B 2018-10-03 13:00:00  0.685609  0.573761  0.287728   2.04
C 2018-10-03 12:30:00  0.953490 -1.689625 -0.344943  -0.02
D 2018-10-04 12:05:00 -0.514984  0.244509 -0.189313   0.98
E 2018-10-08 12:00:15  0.464802  0.845930 -0.503542   4.00
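
As a possible alternative, dividing the timedelta column by a one-day pd.Timedelta should give the same float number of days without the explicit 86400. The sketch below rebuilds only the 'W' column from the question; the variable names (w, delta, days_a, days_b) and the explicit format string are my own assumptions, not part of the original code:

```python
import pandas as pd

# Rebuild only the 'W' column from the question (month/day/year strings).
w = pd.to_datetime(
    pd.Series(
        ['10/01/2018 12:00:00', '10/03/2018 13:00:00',
         '10/03/2018 12:30:00', '10/04/2018 12:05:00',
         '10/08/2018 12:00:15'],
        index=list('ABCDE'),
    ),
    format='%m/%d/%Y %H:%M:%S',  # assumed format, matches the sample strings
)

delta = w.diff()  # timedelta64 Series; the first row is NaT

# Approach from the answer: seconds divided by seconds per day.
days_a = delta.dt.total_seconds().div(86400).round(2)

# Alternative: divide by a one-day Timedelta, which yields floats directly.
days_b = (delta / pd.Timedelta(days=1)).round(2)

print(pd.DataFrame({'W': w, 'days_a': days_a, 'days_b': days_b}))
```

Note that 3 days 23:55:15 is about 3.9967 days, so rounding to two decimals gives 4.00 rather than the 3.99 shown in the question.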
jezrael