Fairly new to python-polars.

How does it compare to R's {data.table} package in terms of memory usage?

How does it handle shallow copying?

Is in-place/by reference updating possible/the default?

Are there any recent benchmarks on memory efficiency of the big 4 in-mem data wrangling libs (polars vs data.table vs pandas vs dplyr)?

persephone

1 Answer

How does it handle shallow copying?

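Polars memory buffers are reference-counted and copy-on-write. That means you can never accidentally trigger a full data copy within Polars: cloning a DataFrame only bumps reference counts, and data is copied only when a shared buffer is actually modified.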

Is in-place/by reference updating possible/the default?

No, you must reassign the variable. Under the hood, Polars may reuse memory buffers, but that is not visible to the user.

Are there any recent benchmarks on memory efficiency

Comparing memory usage alone also doesn't do justice to the design differences. Polars is currently developing an out-of-core engine. This engine doesn't process all data in memory, but streams data from disk. The design philosophy of that engine is to use as much memory as needed without going OOM. Unused memory is wasted potential.

ritchie46