import polars as pl

df = pl.DataFrame(
    {
        "era": ["01", "01", "02", "02", "03", "03"],
        "pred1": [1, 2, 3, 4, 5, 6],
        "pred2": [2, 4, 5, 6, 7, 8],
        "pred3": [3, 5, 6, 8, 9, 1],
        "something_else": [5, 4, 3, 67, 5, 4],
    }
)
pred_cols = ["pred1", "pred2", "pred3"]
ERA_COL = "era"

I'm trying to do the equivalent of pandas' percentile rank in Polars. Polars' rank function lacks the pct flag that pandas has.
I looked at another question here: how to replace pandas df.rank(axis=1) with polars
But the results from that question, once applied to my code, are off. Calculating the percentile rank in pandas gives me a single float, whereas the Polars example gives me an array rather than a float, so the example must be calculating something different.
For example, this is the pandas code:
df[list(pred_cols)] = df.groupby(ERA_COL, group_keys=False).apply(
    lambda d: d[list(pred_cols)].rank(pct=True)
)