2

I have two fields in a dataframe, both of which are datetime64[ns]

I thought I could just do this...

df_hist['MonthsBetween'] = (df_hist.Last_Date - df_hist.Begin_Time) / pd.Timedelta(months=1)

One field has only a data and one has a date/time, but both are of datatype datetime64[ns]. I Googled this and it seems like it should work, but I'm getting an error message saying:

TypeError: '<' not supported between instances of 'str' and 'int'

I thought these were both datetime64[ns], and neither str or int.

Surjit Samra
  • 4,614
  • 1
  • 26
  • 36
ASH
  • 20,759
  • 19
  • 87
  • 200

1 Answers1

1

You can use to_period to take months and then subtract

    df_hist['MonthsBetween'] = df_hist['Last_Date'].dt.to_period('M').astype(int) - df_hist['Begin_Time'].dt.to_period('M').astype(int)
df
Out[123]: 
                 Last_Date      Begin_Time  Months
0  2022-01-01 00:00:00 2022-03-01       2
1  2022-01-01 12:00:00 2022-03-02       2
2  2022-01-02 00:00:00 2022-03-03       2
3  2022-01-02 12:00:00 2022-03-04       2
4  2022-01-03 00:00:00 2022-03-05       2
5  2022-01-03 12:00:00 2022-03-06       2
6  2022-01-04 00:00:00 2022-03-07       2
7  2022-01-04 12:00:00 2022-03-08       2

dtypes of date columns should be datetime64[ns]

df_hist.dtypes
Out[125]: 
Last_Date     datetime64[ns]
Begin_Time     datetime64[ns]
Months             int64
dtype: object
Surjit Samra
  • 4,614
  • 1
  • 26
  • 36
  • I should have mentioned it before. I tried that, or something very similar, and I got this error. AttributeError: 'DataFrame' object has no attribute 'dt' – ASH Jan 03 '23 at 21:20
  • what does df_hist.dtypes gives you ? – Surjit Samra Jan 03 '23 at 21:28
  • Now...I'm getting....ValueError: cannot assemble with duplicate keys I've never seen this before... – ASH Jan 03 '23 at 22:22
  • 1
    Ultimately, this worked for me. df_hist['MonthsBetween'] = ((df_hist.Last_Date - df_hist.Begin_Time)/np.timedelta64(1, 'M')) The data types are object now! – ASH Jan 03 '23 at 23:26