Unfortunately, columns of type Object are often a dead end. From the Data Types section of the Polars User Guide:
Object: A limited supported data type that can be any value.
Since support is limited, operations on columns of type Object often throw exceptions.
However, there may be a way to retrieve the values in this particular situation. As an example, let's purposely create a column of type Object.
import polars as pl
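# Each value is a one-element list, mimicking per-row model output.
data_as_list = [[0.49981183], [0.49974033],
                [0.4997973], [0.49973667], [0.49978396]]
# Force dtype=pl.Object so Polars stores the raw Python objects.
df = pl.DataFrame([
    pl.Series("X", values=data_as_list, dtype=pl.Object),
])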
print(df)
shape: (5, 1)
┌──────────────┐
│ X            │
│ ---          │
│ object       │
╞══════════════╡
│ [0.49981183] │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ [0.49974033] │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ [0.4997973]  │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ [0.49973667] │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ [0.49978396] │
└──────────────┘
This approach may work...
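def attempt_recover(series: pl.Series) -> pl.Series:
    # Pull the single number out of each one-element list, one Python object at a time.
    return pl.Series(values=[val[0] for val in series])

# Newer Polars releases use with_columns instead of with_column and rename Expr.map to map_batches.
df.with_columns(pl.col("X").map(attempt_recover).alias("X_recovered"))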
shape: (5, 2)
┌──────────────┬─────────────┐
│ X            ┆ X_recovered │
│ ---          ┆ ---         │
│ object       ┆ f64         │
╞══════════════╪═════════════╡
│ [0.49981183] ┆ 0.499812    │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ [0.49974033] ┆ 0.4997      │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ [0.4997973]  ┆ 0.4997973   │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ [0.49973667] ┆ 0.499737    │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ [0.49978396] ┆ 0.499784    │
└──────────────┴─────────────┘
Try this first on a tiny subset of your data. This may not work. (And it will not be fast.)
What you'll really want to do is change how the model prediction results from Keras are loaded into Polars, so that you never get a column of type Object in the first place. (Often this means indexing an array/list output to extract the number from the array/list before loading into Polars.)