
I want to replace the inf values in a polars series with 0. I am using the polars Python library.

This is my example code:

import polars as pl

example = pl.Series([1,2,float('inf'),4])


This is my desired output:

output = pl.Series([1.0,2.0,0.0,4.0])

All similar questions about replacing values concern polars DataFrames and use the .when expression (e.g. Replace value by null in Polars), which does not seem to be available on a Series object:

AttributeError: 'Series' object has no attribute 'when'

Is this possible using polars expressions?

EDIT: I found the following solution but it seems very convoluted:

example.map_dict({float('inf'): 0}, default=pl.first())
Sandwichnick
  • You need a dataframe for expressions. `pl.select` simplifies this, e.g. `pl.select(pl.when(example == float("inf")).then(0).otherwise(example))`. It results in a dataframe though, so you would need a `.to_series()`. Another option is `.zip_with`, e.g. `example.zip_with(example != float("inf"), pl.Series([0]))` – jqurious May 02 '23 at 08:44
  • Why does `map_dict` seem convoluted if all you want to do is replace inf with 0? – Dean MacGregor May 02 '23 at 13:14
  • I don't know. It seems overkill to use a dictionary if I am just replacing a single value – Sandwichnick May 02 '23 at 22:22

2 Answers

import polars as pl

example = pl.Series("example", [1, 2, float('inf'), 4])

# Create a DataFrame from the Series
df = pl.DataFrame([example])

# Replace inf values with 0 using the when expression
df = df.with_columns(
    pl.when(pl.col("example") == float('inf'))
    .then(0)
    .otherwise(pl.col("example"))
    .alias("example")
)

# Get the output Series
output = df["example"]

print(output)

Result:

shape: (4,)
Series: 'example' [f64]
[
    1.0
    2.0
    0.0
    4.0
]
Taras Drapalyuk

You can use Series.set:

s = s.set(s == float('inf'), 0)

although there's a note there:

Use of this function is frequently an anti-pattern, as it can block optimisation (predicate pushdown, etc). Consider using pl.when(predicate).then(value).otherwise(self) instead.

which suggests a much longer alternative:

s = s.to_frame().select(pl.when(s == float('inf')).then(0).otherwise(s)).to_series()

which may or may not be worth it depending on your use case.

levant pied