Questions tagged [python-polars]

Polars is a DataFrame library/in-memory query engine.

The Polars core library is written in Rust and uses Arrow, the native arrow2 Rust implementation, as its foundation. It offers Python and JavaScript bindings, which serve as a wrapper for functionality implemented in the core library.

1331 questions
0
votes
2 answers

How to replicate df.groupby('some_column').resample('Q').agg({'total': 'count'}) in polars with groupby_dynamic

To group quarterly in pandas, I would write (given that the index contains the 'date' column): df = df.groupby('some_column').resample('Q').agg({"total": "count"}). I tried to replicate that with polars: df =…
supersick
  • 261
  • 2
  • 14
0
votes
1 answer

convert a pandas loc operation that needed the index to assign values to polars

In this example I have three columns: 'DayOfWeek', 'Time' and 'Risk'. I want to group by 'DayOfWeek', take only the first element, and assign a high risk to it. This means the first known hour in each day of the week is the one that has the highest…
zacko
  • 179
  • 2
  • 9
0
votes
1 answer

PyPolars: Speed up apply function to find common elements

I am trying to find common elements in a column of lists with respect to a reference cell. I could accomplish it with a small dataset, but I face two problems. The speed is excruciatingly slow even for 25 rows of sample data (20.7 s ± 52 ms per loop), and I…
Quiescent
  • 1,088
  • 7
  • 18
0
votes
1 answer

Failing to understand example in documentation about window functions (operations per group) in polars

In the example under the section Operations per group, the author writes col("value").sort().over("group") but doesn't say which value or group was picked. The assumption is that in this example the 'speed' column was selected as value and groups…
zacko
  • 179
  • 2
  • 9
0
votes
1 answer

In polars, nested when-then-otherwise gives unexpected behavior in groupby / window context

I have a custom expression to wrap around the "rank" expression to ignore nulls. def rank(_exp,method='average',reverse=False): #Fill nans so as not to affect ranking fill = -np.Inf if reverse else np.Inf tmp =…
lowmotion
  • 1
  • 1
0
votes
1 answer

How to properly set-up Graphviz for Polars on Mac or Windows?

Polars has a powerful feature called df.show_graph(optimized=True). I have been trying to get this working on my Windows and Mac computers; system info is below. Windows: OS = Windows 10, Architecture = x86_64. MacBook: OS = Monterey, Arch = arm64 (Apple…
Jenobi
  • 368
  • 4
  • 12
0
votes
1 answer

How to form dynamic expressions without breaking on types

Is there any way to keep dynamic polars expressions from breaking with errors? Currently I'm just excluding the columns by type, but I'm wondering if there is a better way. For example, I have a df coming from parquet; if I just execute an expression on all…
Chitral Verma
  • 2,695
  • 1
  • 17
  • 29
0
votes
1 answer

Fill columns independently

I have a Python class with two data attributes: the first is a polars time series, the second a list of strings. A dictionary provides a mapping from string to function: each element of the list is associated with a function that returns a polars…
Sigi
  • 53
  • 8
0
votes
1 answer

Polars meaning of parallelization?

I'd like to use this package as a data backend to expose an API/website with data analysis. How is parallelization done in this package? Is it possible to control the resources consumed?
Devyl
  • 565
  • 3
  • 8
0
votes
1 answer

How to match text efficiently between two DataFrames

I have some text data: data1 id comment title user_A good a file name user_B a better way is… is there some good sugg? user_C a another way is… is there some good sugg? user_C I have been using Pandas for a long time, so I… a…
lemmingxuan
  • 549
  • 1
  • 7
  • 18
0
votes
1 answer

expanding.apply in polars

In pandas I could call data.expanding(min_periods=1).apply(lambda_func) to call a func on expanding or a cumsum-like view. How to do the same thing with polars? I could only find rolling_apply or apply.
Hakase
  • 211
  • 1
  • 12
0
votes
1 answer

String as a condition in a filter

In my program, I want the user to be able to pass a string as a condition. For example, if the user input is "col(X) | col(Y)", I would like this string to be the filter condition in the filter function of a DataFrame. So an example will be like…
Zorp
  • 75
  • 5
0
votes
1 answer

How to get the groupby keys with a loop?

I need to do some fairly complicated processing for each group after grouping. In pandas, it can be written as follows: for i,g in df.groupby(['id','sid']): pass In polars, the groups function returns a DataFrame, but this cannot be…
lemmingxuan
  • 549
  • 1
  • 7
  • 18
0
votes
1 answer

Polars: Setting categorical column to a specific value while keeping categorical type

Can somebody help me with the preferred way to set a categorical value for some rows of a polars data frame (based on a condition)? Right now I came up with a solution that works by splitting the original data frame in two parts (condition==True and…
datenzauber.ai
  • 379
  • 2
  • 11
0
votes
1 answer

Polars: Joining on categorical column after aggregation

I understand that when I create categorical columns in different data frames they won't join/stack when not created under the same global string cache. However, when deriving a new data frame by aggregating from an existing one, shouldn't it be…
datenzauber.ai
  • 379
  • 2
  • 11