0

I have an excel spreadsheet(.xlsx) with a date of birth column as below.

enter image description here

On loading it using the below syntax:

pl.read_excel(r'C:\datos\test.xlsx',read_csv_options={'parse_dates':False})

the date of births are changing into two digit year format as below.

enter image description here

How to avoid this ? I would require the date format as the source format i.e four digit year MM/DD/YYYY.

myamulla_ciencia
  • 1,282
  • 1
  • 8
  • 30

2 Answers2

2

One way is to parse the string dates and transform to date type:

df = pl.read_excel(r'C:\datos\test.xlsx',read_csv_options={'parse_dates':False})

df = df.with_column(pl.col('date_of_birth').str.strptime(pl.Date, fmt='%m-%d-%y').cast(pl.Datetime))

print(df)
┌─────────────────────┐
│ date_of_birth       │
│ -------------       │
│ datetime[μs]        │
╞═════════════════════╡
│ 2002-04-10 00:00:00 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 1997-02-09 00:00:00 │
└─────────────────────┘
Azhar Khan
  • 3,829
  • 11
  • 26
  • 32
-3

Assuming your dataframe has loaded your date column as a string, I would first try

import pandas as pd
df['date_col'] = pd.to_datetime(df['date_col'],dayfirst=True, format='%m/%d/%Y')  

where df is your dataframe and 'date_col' is the column name of your date column.

Dizzy
  • 21
  • 6