Consider the following Polars dataframe:


import polars as pl

df = pl.DataFrame({
    'date': ['2022-01-01', '2022-02-01', '2022-03-01']
})

Currently the dataframe is:

date
2022-01-01
2022-02-01
2022-03-01

The desired output is:

date        year-month
2022-01-01  2022-01
2022-02-01  2022-02
2022-03-01  2022-03

I have tried:


df.with_column(pl.col('date').str.strptime(pl.Date, fmt='%Y-%m').cast(pl.Datetime)).alias('year-month')

I get:

SchemaError: invalid series dtype: expected Utf8, got datetime[ns]

Thanks so much!!!

Vanessa
  • I don't think a pl.Date can hold just a year and month; it has to have a day of month as well. – Wayoshi Apr 01 '23 at 04:18

1 Answer

You'll need to first parse the column as a Date, and then format it as a string in your desired format:

In [9]: df.with_columns(pl.col('date').str.strptime(pl.Date)).with_columns(year_month=pl.col('date').dt.strftime('%Y-%m'))
Out[9]:
shape: (3, 2)
┌────────────┬────────────┐
│ date       ┆ year_month │
│ ---        ┆ ---        │
│ date       ┆ str        │
╞════════════╪════════════╡
│ 2022-01-01 ┆ 2022-01    │
│ 2022-02-01 ┆ 2022-02    │
│ 2022-03-01 ┆ 2022-03    │
└────────────┴────────────┘

Note that once you format it as %Y-%m, the column is a string (str) and no longer of Date dtype.

ignoring_gravity