Consider the following example:
import polars as pl

zz = pl.DataFrame({'group': ['a', 'a', 'a', 'a', 'b', 'b', 'b'],
                   'col': [1, 2, 3, 4, 1, 3, 2]})
zz
Out[16]:
shape: (7, 2)
┌───────┬─────┐
│ group ┆ col │
│ ---   ┆ --- │
│ str   ┆ i64 │
╞═══════╪═════╡
│ a     ┆ 1   │
│ a     ┆ 2   │
│ a     ┆ 3   │
│ a     ┆ 4   │
│ b     ┆ 1   │
│ b     ┆ 3   │
│ b     ┆ 2   │
└───────┴─────┘
I am trying to bin col into quantiles within each group, essentially replicating
pandas qcut applied per group. This is easy in pandas, as shown here:
import pandas as pd

xx = pl.DataFrame({'group': ['a', 'a', 'a', 'a', 'b', 'b', 'b'],
                   'col': [1, 2, 3, 4, 1, 3, 2]}).to_pandas()
xx.groupby('group').col.transform(lambda x: pd.qcut(x, q=2, labels=False))
Out[18]:
0    0
1    0
2    1
3    1
4    0
5    1
6    0
Name: col, dtype: int64
But how can I do this in Polars? Thanks!