In Polars, we would not use a pivot table for this. Instead, we would use the groupby and agg functions. Using your data, it would be:
import polars as pl
df = pl.from_pandas(df)
df.groupby("obj").agg(pl.all().n_unique())
shape: (2, 4)
┌──────┬───────┬───────┬──────┐
│ obj  ┆ price ┆ value ┆ date │
│ ---  ┆ ---   ┆ ---   ┆ ---  │
│ str  ┆ u32   ┆ u32   ┆ u32  │
╞══════╪═══════╪═══════╪══════╡
│ ring ┆ 1     ┆ 2     ┆ 2    │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ shoe ┆ 1     ┆ 1     ┆ 1    │
└──────┴───────┴───────┴──────┘
pivot and melt
In Polars, the pivot function is used to summarize a dataset in 'long' format into a dataset in 'wide' format. As an example, let's first convert your original dataset to 'long' format using the melt function.
df2 = df.melt(id_vars="obj", value_vars=["price", "date", "value"])
print(df2)
shape: (9, 3)
┌──────┬──────────┬────────────┐
│ obj  ┆ variable ┆ value      │
│ ---  ┆ ---      ┆ ---        │
│ str  ┆ str      ┆ str        │
╞══════╪══════════╪════════════╡
│ ring ┆ price    ┆ 65         │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┤
│ shoe ┆ price    ┆ 42         │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┤
│ ring ┆ price    ┆ 65         │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┤
│ ring ┆ date     ┆ 2022-02-07 │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┤
│ shoe ┆ date     ┆ 2022-01-07 │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┤
│ ring ┆ date     ┆ 2022-03-07 │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┤
│ ring ┆ value    ┆ 53         │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┤
│ shoe ┆ value    ┆ 55         │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┤
│ ring ┆ value    ┆ 54         │
└──────┴──────────┴────────────┘
Now let's use pivot to summarize this 'long' format dataset back to one in 'wide' format and simply count the number of values.
df2.pivot(values='value', index='obj', columns='variable', aggregate_fn='count')
shape: (2, 4)
┌──────┬──────┬───────┬───────┐
│ obj  ┆ date ┆ price ┆ value │
│ ---  ┆ ---  ┆ ---   ┆ ---   │
│ str  ┆ u32  ┆ u32   ┆ u32   │
╞══════╪══════╪═══════╪═══════╡
│ ring ┆ 2    ┆ 2     ┆ 2     │
├╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ shoe ┆ 1    ┆ 1     ┆ 1     │
└──────┴──────┴───────┴───────┘
Does this help clarify the use of the pivot functionality?