I am trying to group a dataframe by year of the date column. First, let's create a dataframe:

let s0 = Series::new("date", &["2021-01-14","2022-04-09","2021-06-24","2022-12-04","2022-11-25"]);
let s1 = Series::new("values", &[1, 2, 3, 4, 5]);
let mut df = DataFrame::new(vec![s0, s1])?;
// Parse the string column into a Date column; `?` propagates any parsing error.
df.try_apply("date", |col_series| {
    Ok(col_series.utf8()?.as_date(Some("%Y-%m-%d"))?.into_series())
})?;
let lf = df.lazy();

And here is the (non-working) code for what I would like to achieve:

lf.groupby([col("date").year()]).agg([col("values").sum()]).collect()
ste_kwr

1 Answer

You can access the date/time namespace of the "date" column by calling col("date").dt().

That namespace provides the year() function, which is not available on the expression directly. In your case the correct code would be:

lf.groupby([col("date").dt().year()])
        .agg([col("values").sum()])
        .collect();

This is adapted from the filtering section of the Python docs:

https://pola-rs.github.io/polars-book/user-guide/dsl/groupby.html#filtering
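
As a sanity check on what the grouped query should return, here is a minimal plain-Rust sketch of the same aggregation using only the standard library. It is not Polars code: the helper names `year_of` and `sum_by_year` are made up for illustration, and it assumes the dates are "YYYY-MM-DD" strings as in the question.

```rust
use std::collections::BTreeMap;

/// Extracts the year from a date string such as "2021-01-14".
/// Returns None when the text before the first '-' is not a number.
fn year_of(date: &str) -> Option<i32> {
    date.split('-').next()?.parse().ok()
}

/// Sums the values per calendar year (hypothetical helper, not a Polars API).
/// Rows whose date cannot be parsed are skipped.
fn sum_by_year(rows: &[(&str, i64)]) -> BTreeMap<i32, i64> {
    let mut totals = BTreeMap::new();
    for &(date, value) in rows {
        if let Some(year) = year_of(date) {
            *totals.entry(year).or_insert(0) += value;
        }
    }
    totals
}
```

Calling `sum_by_year` on the five rows from the question gives 4 for 2021 (1 + 3) and 11 for 2022 (2 + 4 + 5), which are the sums the groupby should report for each year.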

Kival M