Questions tagged [python-polars]

Polars is a DataFrame library/in-memory query engine.

The Polars core library is written in Rust and uses Arrow, the native arrow2 Rust implementation, as its foundation. It offers Python and JavaScript bindings, which serve as wrappers for functionality implemented in the core library.

1331 questions
3
votes
1 answer

Polars: return dataframe with all unique values of N columns

I have a dataframe that has many rows per combination of the 'PROGRAM', 'VERSION' and 'RELEASE_DATE' columns. I want to get a dataframe with all of the combinations of just those three columns. Would this be a job for groupby or distinct? Thanks.
rchitect-of-info
  • 1,150
  • 1
  • 11
  • 23
3
votes
1 answer

how to limit number of threads in polars

Is there a way to limit the number of threads used by polars? I am doing this because I am doing a second layer of parallelization around some polars code, and would like to limit the inner parallelism. This should still be better than Pandas due to…
bumpbump
  • 542
  • 4
  • 17
3
votes
1 answer

Filter DataFrame using within-group expression

Assuming I already have a predicate expression, how do I filter with that predicate, but apply it only within groups? For example, the predicate might be to keep all rows equal to the maximum within a group. (There could be multiple rows kept in…
drhagen
  • 8,331
  • 8
  • 53
  • 82
3
votes
1 answer

Python - Polars - value counts on string column

How do I apply a word count to a Polars DataFrame? I have a string column and I want to count the words across all of the text. Thanks. DataFrame example: 0 Would never order again. 1 I'm not sure it gives me any type of glow and ... 2…
MPA
  • 1,011
  • 7
  • 22
3
votes
1 answer

Parsing using Polars

I am trying to load data into a polars DataFrame using the read_csv command, but I keep getting this error: RuntimeError: Any(ComputeError("Could not parse 0.5 as dtype Int64 at column 13.\n The total offset…
Rayen
  • 31
  • 1
  • 2
3
votes
1 answer

How to assign Exponential Moving Averages after groupby in python polars

I have just started using polars in Python and I'm coming from pandas. I would like to know how I can replicate the pandas code below in Python polars: import pandas as pd import polars as pl df['exp_mov_avg_col'] =…
ashuk
  • 31
  • 2
3
votes
2 answers

Combine multiple datetime string columns into one column in Polars

I have the following Python code using pandas: df['EVENT_DATE'] = df.apply( lambda row: datetime.date(year=row.iyear, month=row.imonth, day=row.iday).strftime("%Y-%m-%d"), axis=1) and I want to translate it into valid Polars code. Does anyone…
seb2704
  • 390
  • 1
  • 5
  • 17
3
votes
1 answer

Python Polars column based update does not work

I use the Polars library for dataframe manipulation. I have two dataframes, and I want to update the column values of one dataframe with a single value taken from the other dataframe, based on a condition. This is the code: tmp = df[df['UnifiedInvoiceID'] ==…
irohamca
  • 497
  • 3
  • 19
2
votes
0 answers

Polars 'POLARS_MAX_THREADS' doesn't actually work

Polars 'POLARS_MAX_THREADS' doesn't actually work. It does create POLARS_MAX_THREADS threads at the beginning, but after I do some calculations, the number of threads it uses increases a lot. import os os.environ['POLARS_MAX_THREADS'] =…
Rolnan
  • 23
  • 5
2
votes
2 answers

Python Polars: Number of rows since last value >0

Given a polars DataFrame column like [0, 29, 28, 4, 0, 0, 13, 0], how do I get a new column like [1, 0, 0, 0, 1, 2, 0, 1]? The solution should preferably work with .over() for grouped values and optionally an additional rolling window function like…
OliverHennhoefer
  • 677
  • 2
  • 8
  • 21
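To pin down the transformation the question above describes, here is a plain-Python reference sketch. It is not a Polars solution and does not cover the `.over()` or rolling-window requirements; the function name `rows_since_last_positive` is hypothetical. It only reproduces the input and output lists given in the question.

```python
def rows_since_last_positive(values):
    """Count, for each row, how many rows have passed since the last value > 0.

    A row whose own value is > 0 gets 0; leading rows with no earlier
    positive value count up from 1.
    """
    counts = []
    run = 0
    for v in values:
        # Reset on a positive value, otherwise extend the current run.
        run = 0 if v > 0 else run + 1
        counts.append(run)
    return counts


result = rows_since_last_positive([0, 29, 28, 4, 0, 0, 13, 0])
print(result)  # [1, 0, 0, 0, 1, 2, 0, 1]
```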
2
votes
1 answer

How to apply ip lookup using polars?

Given two tables, I'd like to run a lookup over all IPs and find the network each one belongs to: I have two large tables: and the following networks: Regarding the ClientIP (First table) I thought of casting the whole column with…
JammingThebBits
  • 732
  • 11
  • 31
2
votes
1 answer

How to get the days between today and a polars date?

I'm having a bit of trouble with my python code. I originally wrote it using pandas, but I need something a bit faster, so I'm converting it to polars. After reading the mongodb into polars dataframes with race =…
slydexic
  • 23
  • 4
2
votes
3 answers

Create a new column based partially on other column names

I am new to both Polars and Python in general. I have a somewhat unusual problem I could use some help with. I have a dataframe with 50-plus columns that are 0/1. I need to create a new column that contains a comma-separated list of each column that…
craigm
  • 137
  • 6
2
votes
1 answer

Remove non-ASCII characters from a Polars Dataframe

I have a Polars Dataframe with a mix of Series, which I want to write to a CSV / upload to a database. The problem is that if any of the UTF8 series contain non-ASCII characters, the upload fails because of the DB type I'm using, so I would like to filter out the…
Cade
  • 58
  • 5
2
votes
1 answer

"Vlookup" on Dataframes with multiple conditions in Python (Polars)

I would like to compare two Excel tables and add data from one to the other. Specifically, I have a table for "sorting information", which is structured as…