i am trying to convert my two date columns in my dataframe. however some of the date is converted using "%d/%m/%Y" and a few of the data is getting converted using "%m/%d/%Y" . the issue is happening only from 01-May-2023 to 12-May-2023. from 13 may it is again reverting to using "%d/%m/%Y".
I am using the following to convert my data:
columns_to_convert_to_date= ['Date','Value Dt']
regex_pattern = r'\d{2}/\d{2}/\d{4}'
for column in columns_to_convert_to_date:
full_bank_df_hdfc[column] = full_bank_df_hdfc[column].apply(lambda x: pd.to_datetime(x, format='%d/%m/%Y', errors='coerce') if re.match(regex_pattern, str(x)) else pd.to_datetime(x, errors='coerce'))
screenshot data before transformation: enter image description here
data after transformation: enter image description here
i have tried to force it to use the format to no avail. when using only
pd.to_datetime(full_bank_df_hdfc['Date'],format='%d/%m/%y', errors='coerce')
am i getting the desired output in the column , however that is resulting in my already existing datetime format roes to be NaT
1043 2023-04-30 1044 2023-05-01 1045 2023-05-01 1046 2023-05-01 1047 2023-05-02 1048 2023-05-02 1049 2023-05-03 1050 2023-05-03 1051 2023-05-03 1052 2023-05-04 1053 2023-05-04 1054 2023-05-06 1055 2023-05-06 1056 2023-05-07 1057 2023-05-08