Series
For a single Series, you can use the set
method.
import polars as pl
my_series = pl.Series(['cow', 'cat', '', 'lobster', ''])
my_series.set(my_series.str.lengths() == 0, None)
shape: (5,)
Series: '' [str]
[
"cow"
"cat"
null
"lobster"
null
]
DataFrame
For DataFrames, I would suggest using when/then/otherwise
. For example, with this data:
df = pl.DataFrame({
'str1': ['cow', 'dog', "", 'lobster', ''],
'str2': ['', 'apple', "orange", '', 'kiwi'],
'str3': ['house', '', "apartment", 'condo', ''],
})
df
shape: (5, 3)
┌─────────┬────────┬───────────┐
│ str1 ┆ str2 ┆ str3 │
│ --- ┆ --- ┆ --- │
│ str ┆ str ┆ str │
╞═════════╪════════╪═══════════╡
│ cow ┆ ┆ house │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┤
│ dog ┆ apple ┆ │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┤
│ ┆ orange ┆ apartment │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┤
│ lobster ┆ ┆ condo │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┤
│ ┆ kiwi ┆ │
└─────────┴────────┴───────────┘
We can run a replacement on all string columns as follows:
df.with_columns([
pl.when(pl.col(pl.Utf8).str.lengths() ==0)
.then(None)
.otherwise(pl.col(pl.Utf8))
.keep_name()
])
shape: (5, 3)
┌─────────┬────────┬───────────┐
│ str1 ┆ str2 ┆ str3 │
│ --- ┆ --- ┆ --- │
│ str ┆ str ┆ str │
╞═════════╪════════╪═══════════╡
│ cow ┆ null ┆ house │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┤
│ dog ┆ apple ┆ null │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┤
│ null ┆ orange ┆ apartment │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┤
│ lobster ┆ null ┆ condo │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┤
│ null ┆ kiwi ┆ null │
└─────────┴────────┴───────────┘
The above should be fairly performant.
If you only want to replace empty strings with null on certain columns, you can provide a list:
only_these = ['str1', 'str2']
df.with_columns([
pl.when(pl.col(only_these).str.lengths() == 0)
.then(None)
.otherwise(pl.col(only_these))
.keep_name()
])
shape: (5, 3)
┌─────────┬────────┬───────────┐
│ str1 ┆ str2 ┆ str3 │
│ --- ┆ --- ┆ --- │
│ str ┆ str ┆ str │
╞═════════╪════════╪═══════════╡
│ cow ┆ null ┆ house │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┤
│ dog ┆ apple ┆ │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┤
│ null ┆ orange ┆ apartment │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┤
│ lobster ┆ null ┆ condo │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┤
│ null ┆ kiwi ┆ │
└─────────┴────────┴───────────┘