0

Initially I've got

print(transactions.Date.loc[0])
>>>30/04/18

First I've done

transactions['Date'] = pd.to_datetime(transactions['Date'])

Now

print(transactions.Date.loc[0])
>>>2018-04-30 00:00:00

After that I check for an equality condition like so

transactions.Date.loc[0] + datetime.timedelta(days=1) == transactions.Date.loc[1]
>>>False

So I examine

print(transactions.Date.loc[0] + datetime.timedelta(days=1))
>>>2018-05-01 00:00:00 
print(transactions.Date.loc[1])
>>>2018-01-05 00:00:00

Turns out both refer to same time, but with a different format. How do I modify my code to get True on equality?

EDIT: As pointed in the comments, the problem is actually May 1 being transformed to January 5 when I use pd.to_datetime. So the new question is how do I resolve this?

Eeshaan
  • 1,557
  • 1
  • 10
  • 22

1 Answers1

1

The code worked fine, are you sure that the difference between transactions.Date.loc[0] andtransactions.Date.loc[1] is exactly 1 day?

>>> l = pd.to_datetime(['2000/23/11', '2000/24/11'], dayfirst=True)
>>> l[0] + datetime.timedelta(days=1) == l[1]
True
>>> print(l[0], l[0] + datetime.timedelta(days=1), l[1], sep='\t')
2000-11-23 00:00:00     2000-11-24 00:00:00     2000-11-24 00:00:00

Edit: using your data i've found the error:

with dayfirst=True works well

>>> l = pd.to_datetime(['30/4/20', '1/5/20'], dayfirst=True)
>>> l[0] - l[1]
Timedelta('-1 days +00:00:00')

without it the conversion format is different between the 2 dates

>>> l = pd.to_datetime(['30/4/20', '1/5/20'])
>>> l[0] - l[1]
Timedelta('116 days 00:00:00')
Hoxha Alban
  • 1,042
  • 1
  • 8
  • 12