
Here is the code I have so far:

import pandas as pd

df = pd.read_csv('/content/drive/MyDrive/Colab Datasets/KickstarterRevised.csv')
df['deadline'] = pd.to_datetime(df['deadline'])
df['launched'] = pd.to_datetime(df['launched'])

df['difference'] = df['deadline'].sub(df['launched'], axis=0)
df['difference']
0      58 days 23:24:00
1      45 days 00:00:00
2      30 days 01:00:00
3      55 days 16:25:00
4      35 days 00:00:00
             ...       
4994   40 days 00:00:00
4995    8 days 10:50:00
4996   38 days 18:53:00
4997   30 days 00:00:00
4998   30 days 00:00:00
Name: difference, Length: 4999, dtype: timedelta64[ns]
FObersteiner
    To get days, you could use the `days` attribute of the timedelta, or `.total_seconds()/86400` to get fractional days. You can access both via the `dt` accessor. – FObersteiner Oct 14 '21 at 05:49
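A minimal sketch of both options from the comment. The two timestamp pairs are made-up sample data (not taken from the Kickstarter file), and the `days` and `days_frac` column names are only for illustration:

```python
import pandas as pd

# Hypothetical sample data standing in for the Kickstarter CSV
df = pd.DataFrame({
    "launched": pd.to_datetime(["2021-01-01 00:36:00", "2021-02-01 00:00:00"]),
    "deadline": pd.to_datetime(["2021-03-01 00:00:00", "2021-03-18 00:00:00"]),
})
df["difference"] = df["deadline"] - df["launched"]

# Whole days only: the hours/minutes part of each timedelta is dropped
df["days"] = df["difference"].dt.days

# Fractional days: total seconds divided by the number of seconds in a day
df["days_frac"] = df["difference"].dt.total_seconds() / 86400

print(df[["difference", "days", "days_frac"]])
```

Use `.dt.days` if you only care about complete days, and the `total_seconds()` version if partial days matter.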

1 Answer


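As the output of your code shows, df['difference'] is a Series with dtype timedelta64[ns]. To get the number of days, you can cast it with .astype("timedelta64[D]"), see below. Note that this cast works on pandas versions before 2.0 (where it returns the days as floats); newer versions only accept the units 's', 'ms', 'us' and 'ns' and raise an error for 'D', so there you would use df['difference'].dt.days instead.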

df['difference'] = df['deadline'].sub(df['launched'], axis=0).astype('timedelta64[D]')
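If the astype cast is not available in your pandas version, one possible alternative is to divide by a one-day Timedelta. A small sketch, using two values copied from the question's output; the variable names are made up for the example:

```python
import pandas as pd

# Two timedeltas taken from the question's printed output
diff = pd.Series(pd.to_timedelta(["58 days 23:24:00", "45 days 00:00:00"]))

one_day = pd.Timedelta(days=1)

# Floor division gives whole days as integers
whole_days = diff // one_day

# True division gives fractional days as floats
frac_days = diff / one_day

print(whole_days)
print(frac_days)
```

This does not depend on which units astype accepts, since it is plain timedelta arithmetic.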
scandav