In pandas, the following creates a new column in a DataFrame by dividing one existing column by another. How do I do this in Polars? Bonus if it is done in the fastest way, using polars.LazyFrame.

import pandas as pd

df = pd.DataFrame({"col1": [10, 20, 30, 40, 50], "col2": [5, 2, 10, 10, 25]})
df["ans"] = df["col1"] / df["col2"]
print(df)
tommyt

1 Answer

You want to avoid Pandas-style coding and use the Polars Expressions API instead. Expressions are the heart of Polars and yield the best performance.

Here's how we would code this using Expressions, including using Lazy mode:

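import polars as pl

df = pl.DataFrame({"col1": [10, 20, 30, 40, 50], "col2": [5, 2, 10, 10, 25]})

(
    df
    .lazy()
    # with_columns replaces the older with_column spelling
    .with_columns(
        (pl.col('col1') / pl.col('col2')).alias('result')
    )
    .collect()
)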
shape: (5, 3)
┌──────┬──────┬────────┐
│ col1 ┆ col2 ┆ result │
│ ---  ┆ ---  ┆ ---    │
│ i64  ┆ i64  ┆ f64    │
╞══════╪══════╪════════╡
│ 10   ┆ 5    ┆ 2.0    │
├╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ 20   ┆ 2    ┆ 10.0   │
├╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ 30   ┆ 10   ┆ 3.0    │
├╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ 40   ┆ 10   ┆ 4.0    │
├╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ 50   ┆ 25   ┆ 2.0    │
└──────┴──────┴────────┘

Here's a section of the User Guide that may help with the transition from Pandas-style coding to Polars Expressions.

David Waterworth