Questions tagged [python-polars]

Polars is a DataFrame library/in-memory query engine.

The Polars core library is written in Rust and uses Arrow, the native arrow2 Rust implementation, as its foundation. It offers Python and JavaScript bindings, which serve as wrappers for functionality implemented in the core library.

1331 questions
0
votes
2 answers

How to get hash of string column in Polars or Pyarrow

I have a pandas DataFrame / Polars DataFrame / PyArrow table with a string key column. You can assume the strings are random. I want to partition that dataframe into N smaller dataframes based on this key column. With an integer column, I can just use…
bumpbump
  • 542
  • 4
  • 17
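For the hash-partitioning question above, the underlying idea can be sketched without any DataFrame library: hash each string key deterministically and take the result modulo N. This is only a plain-Python illustration of that idea, not the asker's or any answerer's solution; the helper names `partition_index` and `partition_rows` and the sample rows are hypothetical. Polars also exposes a hash expression on columns, which is not used here.

```python
import zlib


def partition_index(key: str, n_partitions: int) -> int:
    """Map a string key to a partition number in [0, n_partitions)."""
    # zlib.crc32 gives the same value in every process, unlike the built-in
    # hash(), which is salted per interpreter run for str objects.
    return zlib.crc32(key.encode("utf-8")) % n_partitions


def partition_rows(rows, key_field, n_partitions):
    """Split a list of dict rows into n_partitions buckets by hashed key."""
    buckets = [[] for _ in range(n_partitions)]
    for row in rows:
        buckets[partition_index(row[key_field], n_partitions)].append(row)
    return buckets


# Hypothetical sample data: rows that share a key land in the same bucket.
rows = [
    {"key": "alpha", "v": 1},
    {"key": "beta", "v": 2},
    {"key": "alpha", "v": 3},
]
buckets = partition_rows(rows, "key", 4)
```

Because the partition number depends only on the key, the same function can be applied to each chunk of a larger dataset independently.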
0
votes
1 answer

Maximum number of columns shown when formatting DataFrames in python-polars?

How do I set the maximum number of columns shown when formatting DataFrames? I found that in rust-polars I need to modify the "POLARS_FMT_MAX_COLS" environment variable. How can I do this in Python?
Ar A
  • 1
  • 1
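For the column-limit question above, one possible approach is to set the environment variable from Python itself. This is a sketch under an assumption: that the Python bindings honor the same `POLARS_FMT_MAX_COLS` variable the question reports for rust-polars. The value 20 is an arbitrary example.

```python
import os

# Assumption: py-polars reads the same POLARS_FMT_MAX_COLS variable that the
# question mentions for rust-polars. Set it before printing any DataFrame.
os.environ["POLARS_FMT_MAX_COLS"] = "20"  # 20 is an arbitrary example value

# Read the value back to confirm that it is set for this process.
max_cols = os.environ["POLARS_FMT_MAX_COLS"]
```

Recent Polars releases also provide a `pl.Config` helper (for example `pl.Config.set_tbl_cols`) for the same purpose; whether it is available depends on the installed version.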
0
votes
0 answers

changing dtype in polars

I created a data frame using Polars. When data is inserted, the dtype of the column is automatically set to match what was inserted (I think it's a feature of Polars?). But how do you change the dtype of a specific column? For example, "name" has f32 by default…
wookidookik123
  • 79
  • 2
  • 10
0
votes
2 answers

Cumulative sum that resets when turning negative/positive

I am trying to add a column (column C) to my polars dataframe that counts how many times a value of one of the dataframe's columns (column A) is greater/less than the value of another column (column B). Once the value…
bruppfab
  • 3
  • 2
0
votes
1 answer

Convert column containing single element arrays into column of floats with Python polars

I've started using polars recently (https://pola-rs.github.io/polars/py-polars/html/reference/index.html). I have a column in my data frame that contains single element arrays (output of a keras…
user555265
  • 493
  • 2
  • 7
  • 18
0
votes
1 answer

Memory-efficient row-wise shuffle Polars

A simple row-wise shuffle in Polars with df = df.sample(frac=1.0) has a peak memory usage of 2x the size of the dataframe (profiling with mprof). Is there any fast way to perform a row-wise shuffle in Polars while keeping the memory usage down as…
Danny Friar
  • 383
  • 4
  • 17
0
votes
1 answer

how to replace pandas df.rank(axis=1) with polars

Alpha factors sometimes need a cross-sectional rank, like this: import pandas as pd; df = pd.DataFrame(some_data); df.rank(axis=1, pct=True). How can this be implemented efficiently with Polars?
0
votes
2 answers

How to wrap ta-lib function as a Polars expression

I am trying to call some TA-lib (https://github.com/mrjbq7/ta-lib) functions through Polars so that multiple stocks' technical indicators can be calculated through Polars' parallel computing framework. Here is the sample code: import…
Peascod
  • 11
  • 2
0
votes
2 answers

Filling `null` values of a column with another column

I want to fill the null values of a column with the content of another column of the same row in a lazy data frame in Polars. Is this possible with reasonable performance?
zareami10
  • 111
  • 7
0
votes
1 answer

Pandas: Rolling Mean and ignore NaN

How do you tell pandas to ignore NaN values when calculating a mean? With min_periods, pandas will return NaN for min_periods rows when it encounters a single NaN. Example: pd.DataFrame({ 'x': [np.nan, 0, 1, 2, 3, np.nan, 5, 6, 7, 8,…
Test
  • 962
  • 9
  • 26
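The behavior asked for in the rolling-mean question above can be illustrated in plain Python: average only the non-NaN values inside each trailing window. This is a sketch of the intended semantics, not the pandas or Polars API; the function name `rolling_nanmean`, the window size of 3, and the choice of a trailing window are assumptions. The input reuses the first seven values visible in the excerpt.

```python
import math


def rolling_nanmean(values, window, min_periods=1):
    """Trailing rolling mean that skips NaN values inside each window."""
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        # Keep only the real numbers in the current window.
        valid = [v for v in values[start:i + 1] if not math.isnan(v)]
        if len(valid) >= min_periods:
            out.append(sum(valid) / len(valid))
        else:
            # Too few valid observations: emit NaN for this position.
            out.append(math.nan)
    return out


result = rolling_nanmean(
    [math.nan, 0.0, 1.0, 2.0, 3.0, math.nan, 5.0], window=3
)
```

With `min_periods=1`, a position is NaN only when every value in its window is NaN, so a single missing value does not blank out the neighboring results.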
0
votes
2 answers

thread '<unnamed>' panicked at assertion

I received an unknown error in Python Polars: thread '<unnamed>' panicked at 'assertion failed: `(left == right)` left: `Float64[NaN, 1, NaN, NaN, NaN, ...[clip]... right: `Float64[NaN, 1, NaN, NaN, NaN, ...[clip]... Is this an internal…
Test
  • 962
  • 9
  • 26
0
votes
1 answer

Using a reduction ufunc in agg

How do I use a ufunc that reduces to a scalar in the context of aggregation? For example, summarizing a table using numpy.trapz: import polars as pl import numpy as np df = pl.DataFrame(dict(id=[0, 0, 0, 1, 1, 1], t=[2, 4, 5, 10, 11, 14], y=[0, 1,…
drhagen
  • 8,331
  • 8
  • 53
  • 82
0
votes
1 answer

Handle nan values in rolling operations

I'm testing rolling operation and I have the following problem: import polars as pl import numpy as np df = pl.DataFrame( { "values": [np.nan, 1, 1, 2, 4, 5, 3] } ) df = df.select( [ pl.all(), …
Sigi
  • 53
  • 8
0
votes
2 answers

Excel equivalent average if on moving window

I'm learning polars (as a substitute for pandas) and I would like to replicate some Excel functions, in particular an average-if over a rolling window. Let us suppose we have a column with positive and negative values, how can I create a new column with rolling…
Sigi
  • 53
  • 8
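The rolling average-if described in the question above can be sketched in plain Python: within each trailing window, average only the values that satisfy a condition. This illustrates the logic only and is not a Polars solution; the function name `rolling_average_if`, the window size, the positive-values predicate, and the sample data are all assumptions, since the excerpt is cut off before it states the exact condition.

```python
def rolling_average_if(values, window, predicate):
    """Trailing rolling mean over only the values where predicate(v) is true."""
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        selected = [v for v in values[start:i + 1] if predicate(v)]
        # None marks windows in which no value satisfies the condition.
        out.append(sum(selected) / len(selected) if selected else None)
    return out


# Hypothetical data: average only the positive values in a window of 3.
result = rolling_average_if(
    [1.0, -2.0, 3.0, -4.0, 5.0], window=3, predicate=lambda v: v > 0
)
```

Passing the condition as a predicate keeps the window logic separate from the filter, so the same function covers an average of negative values or any other threshold.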
0
votes
3 answers

Number of rows within each window

I have some expressions that I will evaluate later either within or without a window function. This normally works fine. Have pl.col("x").max()—add .over("y") later. Have pl.arange(0, pl.count())—add .over("y") later. One expression this does not…
drhagen
  • 8,331
  • 8
  • 53
  • 82