
I have a DataFrame with a DatetimeIndex, and I need to create a column containing the time difference between consecutive rows of the index, expressed in hours. This is what I have:

Datetime            Numbers
2020-11-27 08:30:00    1
2020-11-27 13:00:00    2
2020-11-27 15:15:00    3
2020-11-27 20:45:00    4
2020-11-28 08:45:00    5
2020-11-28 10:45:00    6
2020-12-01 04:00:00    7
2020-12-01 08:15:00    8
2020-12-01 12:45:00    9
2020-12-01 14:45:00   10
2020-12-01 17:15:00   11
         ...

This is what I need:

Datetime             Numbers   Delta
2020-11-27 08:30:00    1        NaN
2020-11-27 13:00:00    2       4.5
2020-11-27 15:15:00    3       2.25
2020-11-27 20:45:00    4       5.5
2020-11-28 08:45:00    5       12
2020-11-28 10:45:00    6       2
2020-12-01 04:00:00    7       65.25
2020-12-01 08:15:00    8       4.25
2020-12-01 12:45:00    9       4.5
2020-12-01 14:45:00   10       2
2020-12-01 17:15:00   11       2.5
                  ...

The DataFrame has thousands of rows, so I can't use a "for" loop. Thanks in advance!

EDIT: I found a solution:

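import numpy as np

df = df.reset_index()
# Nanoseconds since the epoch -> whole seconds (assumes datetime64[ns] values)
df['Time'] = df['Datetime'].astype(np.int64) // 10**9
# Seconds between consecutive rows -> hours
df['Delta'] = df['Time'].diff() / 3600
df.drop(columns=['Time'], inplace=True)
df.set_index('Datetime', inplace=True)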
Gus

1 Answer


I assume that Datetime is set as the index:

df.reset_index(inplace=True)
df['Delta'] = df['Datetime'].diff().dt.total_seconds()/3600
df.set_index('Datetime', inplace=True)

OUTPUT:

                     Numbers  Delta
Datetime                           
2020-11-27 08:30:00        1    NaN
2020-11-27 13:00:00        2   4.50
2020-11-27 15:15:00        3   2.25
2020-11-27 20:45:00        4   5.50
2020-11-28 08:45:00        5  12.00
2020-11-28 10:45:00        6   2.00
2020-12-01 04:00:00        7  65.25
2020-12-01 08:15:00        8   4.25
2020-12-01 12:45:00        9   4.50
2020-12-01 14:45:00       10   2.00
2020-12-01 17:15:00       11   2.50
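
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes a frame rebuilt from the first five rows of the question's sample (the construction code is not from the original post) and applies the same three steps, using reassignment instead of `inplace=True`:

```python
import pandas as pd

# Sample data copied from the question (first five rows only).
idx = pd.DatetimeIndex(
    [
        "2020-11-27 08:30:00",
        "2020-11-27 13:00:00",
        "2020-11-27 15:15:00",
        "2020-11-27 20:45:00",
        "2020-11-28 08:45:00",
    ],
    name="Datetime",
)
df = pd.DataFrame({"Numbers": [1, 2, 3, 4, 5]}, index=idx)

# Move the index into a column, diff it, convert seconds to hours,
# then restore the index.
df = df.reset_index()
df["Delta"] = df["Datetime"].diff().dt.total_seconds() / 3600
df = df.set_index("Datetime")

print(df)
```

The first row has no predecessor, so its Delta is NaN, as in the output above.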
Muhammad Hassan
  • Please avoid answering duplicate questions. – Mayank Porwal Oct 24 '21 at 08:59
  • I was coming back to answer my own question and found that Muhammad had given the same answer. Thanks, Muhammad, you are the best. By the way, Mayank, it is not the same question: in my case that solution does not work, since I have a datetime index and not a datetime column. – Gus Oct 24 '21 at 09:10
  • @Gus Please check the question I linked more carefully. It also uses a datetime index, not a column. – Mayank Porwal Oct 24 '21 at 09:13
  • @Muhammad This gives me an error: `df['Delta'] = df['Datetime'].diff().dt.total_seconds()/3600` raises TypeError: 'method' object is not subscriptable – Gus Oct 24 '21 at 09:41
  • Working fine for me. Are you sure `df` is a DataFrame here? – Muhammad Hassan Oct 24 '21 at 10:03
  • @Muhammad You are right, it works. The problem was that, while the example here shows a DataFrame, in my code it was actually a Series. I converted it to a DataFrame and it worked. – Gus Oct 26 '21 at 09:07
  • @Mayank Now that I have converted it to a DataFrame, this works: `df['delta'] = df.index.to_series().diff()`. However, the result is not in the correct format; as I said, it should be expressed in hours as a number. So it is still not the same issue. – Gus Oct 26 '21 at 09:14
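
Following up on the last two comments, here is a hedged sketch of one way to get hours as a plain number without resetting the index. It assumes the data starts out as a Series, as described above, and the sample values and variable names are illustrative only. Dividing a timedelta Series by `pd.Timedelta(hours=1)` yields floats, which is an alternative to `.dt.total_seconds() / 3600`:

```python
import pandas as pd

idx = pd.DatetimeIndex(
    [
        "2020-11-27 08:30:00",
        "2020-11-27 13:00:00",
        "2020-11-27 15:15:00",
        "2020-11-27 20:45:00",
    ],
    name="Datetime",
)
s = pd.Series([1, 2, 3, 4], index=idx, name="Numbers")

# Convert the Series to a one-column DataFrame so a new column can be added.
df = s.to_frame()

# index.to_series() returns a Series indexed by the timestamps themselves,
# so the diff aligns with df when assigned back.
delta = df.index.to_series().diff()

# Timedelta / Timedelta gives a float: here, the number of hours.
df["Delta"] = delta / pd.Timedelta(hours=1)

print(df)
```

This relies on the index values being unique so that the assignment aligns row by row; with duplicate timestamps, the reset_index approach from the answer may be the safer choice.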