I am currently doing a groupby and ranking values in Polars:

let df = df.clone().lazy()
    .select([
        all(),
        col("value").rank(rank_opts).over(["groupby_id"]).alias("rank"),
    ])
    .collect()
    .unwrap();

But I am finding it to be pretty slow. I am trying a different method, which I used in R because it was much faster than ranking: sort by value, group, and then assign the sequence 1:group_size within each group. With R's data.table it looks like this:

data_table[, rank := seq_len(.N), keyby=groupby_id]

Here, .N is the number of rows in the current group.
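
To make the intended result concrete, here is a minimal plain-Rust sketch of the sort-then-number idea. It does not use Polars at all; the function name `rank_within_groups` and the sample data are hypothetical and only there for illustration. It assumes ties should keep their input order, which corresponds to an ordinal-style rank, and unlike the data.table `keyby` version it returns the numbers in the original row order rather than reordering the rows.

```rust
use std::collections::HashMap;

/// Hypothetical helper: for each row, return its 1-based position within
/// its group when rows are ordered by value (ties keep input order).
fn rank_within_groups(group_ids: &[u32], values: &[f64]) -> Vec<u32> {
    assert_eq!(group_ids.len(), values.len());

    // Row indices ordered by value. `sort_by` is stable, so equal values
    // stay in their original relative order.
    let mut order: Vec<usize> = (0..values.len()).collect();
    order.sort_by(|&a, &b| values[a].total_cmp(&values[b]));

    // Walk the rows in value order and hand out 1, 2, 3, ... per group,
    // i.e. the equivalent of seq_len(.N) on pre-sorted data.
    let mut counters: HashMap<u32, u32> = HashMap::new();
    let mut ranks = vec![0u32; values.len()];
    for row in order {
        let next = counters.entry(group_ids[row]).or_insert(0);
        *next += 1;
        ranks[row] = *next;
    }
    ranks
}

fn main() {
    // Made-up sample data: two groups (1 and 2).
    let group_ids = [1, 1, 2, 1, 2];
    let values = [3.0, 1.0, 5.0, 2.0, 4.0];
    let ranks = rank_within_groups(&group_ids, &values);
    println!("{:?}", ranks); // prints [3, 1, 2, 2, 1]
}
```

This is the output I would like to get as a new column, computed per group.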

How can I assign a new column that is equivalent to 1:group_size for each group?

Jage
