I am trying to convert a series of date strings to dates. However, even with the 'exact' parameter set to True, when a string has no day value, pd.to_datetime fills in a default day of 01 instead of returning NaT.

PS: I am using pandas 0.24.2

import numpy as np
import pandas as pd

data = np.array(['2014-10','2015-06/07','20-14-08','2a115-09'])

ser = pd.Series(data)

pd.to_datetime(ser, errors='coerce', format='%Y-%m-%d', exact=True)

#Result:
0   2014-10-01
1          NaT
2          NaT
3          NaT
dtype: datetime64[ns]

#Expected:
0   NaT
1   NaT
2   NaT
3   NaT
dtype: datetime64[ns]
Sharath

1 Answer

You can check whether each string is in the full date format with str.match, then use np.where to keep the parsed date only where it matches and NaT everywhere else.

Note that I added one more date to show a correctly parsed value:

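import numpy as np
import pandas as pd

data = np.array(['2014-10','2015-06/07','20-14-08','2a115-09', '2018-09-20'])

# True only for strings that are entirely in YYYY-MM-DD form
m = pd.Series(data).str.match(r'\d{4}-\d{2}-\d{2}$')
# Keep the parsed date where the pattern matches, NaT elsewhere
data_new = pd.Series(np.where(m,
                              pd.to_datetime(data, format='%Y-%m-%d', errors='coerce'),
                              np.datetime64('NaT')))
print(data_new)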
0          NaT
1          NaT
2          NaT
3          NaT
4   2018-09-20
dtype: datetime64[ns]
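
The same idea can be wrapped in a small helper. This is only a sketch: the function name to_datetime_strict and its pattern and fmt parameters are hypothetical, introduced here for illustration. Instead of calling np.where after parsing, it blanks out the non-matching strings with Series.where before parsing, so they come back as NaT.

```python
import pandas as pd


def to_datetime_strict(values, pattern=r"\d{4}-\d{2}-\d{2}$", fmt="%Y-%m-%d"):
    """Parse date strings, returning NaT for anything that does not
    match `pattern` in full (hypothetical helper, not a pandas API)."""
    ser = pd.Series(values, dtype="object")
    # Boolean mask: True where the string matches the expected layout.
    # na=False treats missing values as non-matching.
    mask = ser.str.match(pattern, na=False)
    # Replace non-matching strings with NaN, then parse what is left.
    return pd.to_datetime(ser.where(mask), format=fmt, errors="coerce")


result = to_datetime_strict(
    ["2014-10", "2015-06/07", "20-14-08", "2a115-09", "2018-09-20"]
)
print(result)
```

Only the last string matches the pattern, so it is the only one that should parse to a date. The regular expression and the format string have to be kept in sync by hand if a different layout is needed.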
Erfan