Hi, I'm working on a fan-fiction project: a full feature and syntax translation of pypolars to R, called "minipolars".
As I understand it, the pypolars API (e.g. DataFrame) in general exhibits immutable behavior, or roughly the same as 'copy-on-write' behavior. Most methods that alter the DataFrame object return a cheap copy. The exceptions known to me are DataFrame.extend and the @columns.setter. In R, most APIs strive for strictly immutable behavior. I plan to support both strictly immutable behavior and optional pypolars-like behavior. The Rust polars API has many mutable operations, lifetimes and whatnot, but it is understandably all about performance and expressiveness.
- Are there many more central mutable behaviors in the pypolars API?
- Would a pypolars-API with only immutable behavior suffer in performance and expressiveness?
The API of the R library data.table does sometimes stray from immutable behavior. However, all such mutable operations are prefixed with set (e.g. setnames, setkey) or use the set operator :=.
- Is there an obvious way in pypolars to recognize if an operation is mutable or not?
By mutable behavior I mean, for example, that calling the method .extend() on df1 after defining the variable df_copy = df1 still affects the value of df_copy.
import polars as pl

df1 = pl.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
df2 = pl.DataFrame({"foo": [10, 20, 30], "bar": [40, 50, 60]})

# df_copy is a second name for the same object, not an independent copy.
df_copy = df1
df_copy_old_shape = df_copy.shape  # (3, 2)

# extend() appends the rows of df2 to df1 in place.
df1.extend(df2)
df_copy_new_shape = df_copy.shape  # (6, 2)

# extend() was an operation with mutable behavior: df_copy was affected.
print(df_copy_old_shape != df_copy_new_shape)  # True
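
To make the naming idea concrete, here is a minimal pure-Python sketch of the kind of convention I have in mind for minipolars, borrowing the set prefix from data.table. Everything in it is hypothetical: TinyFrame, append_rows and set_append_rows are made-up names for illustration and are not part of pypolars or of any existing library.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TinyFrame:
    """Hypothetical stand-in for a DataFrame: column name -> list of values."""

    columns: dict[str, list] = field(default_factory=dict)

    @property
    def shape(self) -> tuple[int, int]:
        # (number of rows, number of columns)
        n_rows = len(next(iter(self.columns.values()), []))
        return (n_rows, len(self.columns))

    def append_rows(self, other: TinyFrame) -> TinyFrame:
        """Immutable variant: builds new lists and leaves self untouched."""
        return TinyFrame({k: v + other.columns[k] for k, v in self.columns.items()})

    def set_append_rows(self, other: TinyFrame) -> TinyFrame:
        """Mutable variant, flagged by the set_ prefix: modifies self in place."""
        for k, v in self.columns.items():
            v.extend(other.columns[k])
        return self


df1 = TinyFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
df2 = TinyFrame({"foo": [10, 20, 30], "bar": [40, 50, 60]})
alias = df1  # second name for the same object

df3 = df1.append_rows(df2)
shape_after_pure = alias.shape  # alias is unaffected by the immutable call

df1.set_append_rows(df2)
shape_after_set = alias.shape  # alias sees the in-place change
```

With a convention like this, a reader could tell from the method name alone whether other references to the same object may be affected, which is what I am hoping to recognize in pypolars as well.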