I have the following code:
import polars as pl

df = pl.DataFrame(
    {
        "grpbyKey": [1, 1, 1, 2, 2, 2],
        "val": ["One"] * 3 + ["Two"] * 3,
    }
)
df2 = pl.DataFrame(
    {
        "grpbyKey": [1, 1, 2, 2, 2, 3],
        "val2": ["One"] * 2 + ["Two"] * 3 + ["Three"],
    }
)

c = df.lazy().with_context(df2.lazy())
result = c.groupby("grpbyKey").agg([pl.all()]).collect()
print(result)
It gives the following result:
shape: (2, 3)
┌──────────┬───────────────────────┬─────────────────────────┐
│ grpbyKey ┆ val                   ┆ val2                    │
│ ---      ┆ ---                   ┆ ---                     │
│ i64      ┆ list[str]             ┆ list[str]               │
╞══════════╪═══════════════════════╪═════════════════════════╡
│ 1        ┆ ["One", "One", "One"] ┆ ["One", "One", "Two"]   │
│ 2        ┆ ["Two", "Two", "Two"] ┆ ["Two", "Two", "Three"] │
└──────────┴───────────────────────┴─────────────────────────┘
I was hoping to see:
shape: (2, 3)
┌──────────┬───────────────────────┬─────────────────────────┐
│ grpbyKey ┆ val                   ┆ val2                    │
│ ---      ┆ ---                   ┆ ---                     │
│ i64      ┆ list[str]             ┆ list[str]               │
╞══════════╪═══════════════════════╪═════════════════════════╡
│ 1        ┆ ["One", "One", "One"] ┆ ["One", "One"]          │
│ 2        ┆ ["Two", "Two", "Two"] ┆ ["Two", "Two", "Two"]   │
└──────────┴───────────────────────┴─────────────────────────┘
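That is, I would like both DataFrames to be grouped by grpbyKey at the same time, so that each group holds the matching rows from df and from df2. Within the groupby I then intend to run a custom function on the two groups.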
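To pin down the semantics I am after (this is only a specification of the desired result, not the solution I am looking for), here is a rough plain-Python sketch. The helper names `group_values` and `cogroup` are made up for illustration and are not part of Polars. I am assuming that keys which appear only in the second frame (key 3 here) should be dropped, as in the expected output above.

```python
from collections import defaultdict


def group_values(keys, values):
    """Collect values into lists per key, preserving first-seen key order."""
    groups = defaultdict(list)
    for k, v in zip(keys, values):
        groups[k].append(v)
    return groups


def cogroup(left_keys, left_vals, right_keys, right_vals):
    """Group both inputs by key and pair the groups up by key.

    Hypothetical helper: keeps only keys present in the left input;
    a key missing from the right input gets an empty list.
    """
    left = group_values(left_keys, left_vals)
    right = group_values(right_keys, right_vals)
    return {k: (left[k], right.get(k, [])) for k in left}


grouped = cogroup(
    [1, 1, 1, 2, 2, 2], ["One"] * 3 + ["Two"] * 3,            # df:  grpbyKey, val
    [1, 1, 2, 2, 2, 3], ["One"] * 2 + ["Two"] * 3 + ["Three"],  # df2: grpbyKey, val2
)

for key, (val, val2) in grouped.items():
    # The custom function would receive `val` and `val2` together here.
    print(key, val, val2)
```

Each key is paired with the rows of both frames that carry that key, rather than with rows of df2 that merely share the row positions of the df group.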
Is there a way to get the groupby to give me both frames grouped?
I would like to stay within the Polars API, since I intend to implement this in Rust eventually, so no Python-only hacks please.