0

Parse the correct datetime for multiple columns and different formats in the same column

Hello community,

I have a csv file that I uploaded to jupyter notebook using read_csv from pandas.

I am trying to parse correctly date-times for multiple columns.

In the same column, I have two formats of DateTime:

1- 2022-01-05T02:24

2- 2022-01-05T02:10:58

I have tried this method:

dateparse = lambda x: datetime.strptime(x, '%Y-%m-%dT%H:%M:%S')

df= pd.read_csv("data.csv", sep=",", encoding = "ISO-8859-1", low_memory=False, date_parser=dateparse)

But it doesn't parse right even this format "2022-01-05T02:10:58" I want to ask you if there is an alternative to use to parse correctly my columns of type DateTime.

I am using the low_memory= False because when I don't use it, I have this warning:

DtypeWarning: Columns (28,29,56,71) have mixed types. Specify dtype option on import or set low_memory=False

Thank you,

candy
  • 3
  • 3

1 Answers1

0

Pandas has a very powerful method pandas.to_datetime()

import pandas as pd
df = pd.DataFrame({
    'date_string': ['2022-01-05T02:24', '2022-01-05T02:10:58'],
    'value': [12, 17]
})
df['date'] = pd.to_datetime(df['date_string'])

would result in the following dataframe:

    date_string           value     date
0   2022-01-05T02:24      12        2022-01-05 02:24:00
1   2022-01-05T02:10:58   17        2022-01-05 02:10:58

You can check that the date column has been parsed correctly by verifying the datatype of the df['date'] Series:

0   2022-01-05 02:24:00
1   2022-01-05 02:10:58
Name: date, dtype: datetime64[ns]
Dima Chubarov
  • 16,199
  • 6
  • 40
  • 76
  • thank you, it was easier to use. i have made adjustments to cover all my columns: def convert_to_datetime(df): for col in df.filter(regex='date',axis=1).columns: df[col]= pd.to_datetime(df[col], errors='coerce') return df – candy Mar 14 '23 at 14:06