Questions tagged [python-polars]

Polars is a DataFrame library/in-memory query engine.

The Polars core library is written in Rust and uses Arrow, the native arrow2 Rust implementation, as its foundation. It offers Python and JavaScript bindings, which serve as wrappers for functionality implemented in the core library.

1331 questions
0
votes
1 answer

How to get the difference sets of two Polars dataframes

With pandas I use pd.concat([df1, df2], axis=0).drop_duplicates(subset=['name1','name2'],keep=False). How can I achieve the same result with Polars?
ztsweet
  • 1
  • 2
0
votes
1 answer

How to set masked values within each group in groupby context using py-polars

Since rank does not handle null values, I want to write a rank function that can handle null values. import numpy as np import polars as pl df = pl.DataFrame({ 'group': ['a'] * 3 + ['b'] * 3, 'value': [2, 1, None, 4, 5, 6], }) df shape:…
Teamon
  • 3
  • 3
0
votes
1 answer

Count positive and negative values on the rows

I have a df as follows: with n known at runtime. I need to count the 1 and -1 values across each row. Namely, I need a new df (or new columns in the old one): Any advice?
Sigi
  • 53
  • 8
0
votes
1 answer

How to apply function to multiple columns

I would like to replace with NaN the values that are above the 0.99 quantile or below the 0.01 quantile in the whole dataframe. So far I have found a way to do this with one column, so I can do it one at a time, but maybe there is a possibility to apply…
yk4r2
  • 1
  • 2
0
votes
1 answer

Is it possible to reference a different dataframe when using Polars expression without using Lambda?

Is there a way to reference another Polars Dataframe in Polars expressions without using lambdas? Just to use a simple example - suppose I have two dataframes: df_1 = pl.DataFrame( { "time": pl.date_range( low=date(2021, 1,…
Scout
  • 27
  • 5
0
votes
1 answer

Python Polars join on column with greater or equal

I have two Polars dataframes: one, df_1, with two columns start and end, and the other, df_2, with a column dates. I want to do a left join on df_2 under the condition that the dates column is between the start and end columns. To…
alexp
  • 697
  • 4
  • 9
0
votes
3 answers

Repeating a date in polars and exploding it

I have a polars dataframe with two date columns that represent a start and end date and then a value that I want to repeat for all dates in between those two dates so that I can join those on other tables. Example input…
slugmagug
  • 13
  • 5
0
votes
1 answer

Polars equivalent of Pandas "df.isnull().any(axis=1)"?

I am trying to replace Pandas with Polars in my code. I have one line of pandas code that returns a series whose ith element is a boolean indicating whether the ith row of df has missing values, see below: df.isnull().any(axis=1) My current way…
jj2523
  • 1
  • 1
0
votes
2 answers

polars equivalent of pandas set_index() to_dict

I have a polars dataframe: import polars as pl df = pl.DataFrame({'index': [1,2,3,2,1], 'object': [1, 1, 1, 2, 2], 'period': [1, 2, 4, 4, 23], 'value': [24, 67, 89, 5, 23]}) How do I do the…
Michael WS
  • 2,450
  • 4
  • 24
  • 46
0
votes
1 answer

polars assign to multiple columns

I am trying to understand if there is any way to do when..then..otherwise in Polars and assign to multiple columns. I have an Elo dataset with millions of rows where I want to assign the current Elo to anything greater than date. In pandas, I would…
Michael WS
  • 2,450
  • 4
  • 24
  • 46
0
votes
1 answer

Is it possible to maintain a persistent SQL connection to a postgres database?

With Pandas, we are able to create persistent connections, which allows (for example) creating temporary tables against which we can join. For example: import pandas as pd import sqlalchemy as sa engine =…
NedDasty
  • 192
  • 1
  • 8
0
votes
1 answer

Is there an equivalent of Power BI Desktop measure definitions in python-polars?

I want to use python-polars to replace the Power BI Desktop tools. The data model can be a multi-column dataset that includes all tabular model columns, but how can I use dynamic measure definitions in python-polars? For example: sum('amount') filter ('Country' =…
Hengaini
  • 44
  • 5
0
votes
2 answers

Is there a polars function to dot product two lists?

I have a dataframe which has list-type columns of equal lengths. I would like to do a dot product on these two columns without having to "explode" the lists (as it would take too much memory). The following code returns results but seems to only…
lowmotion
  • 1
  • 1
0
votes
2 answers

How to update a Polars dataframe

I want to update a Polars dataframe. The syntax I used for this is: df[0, 'A'] = 'some value' but the above code gives an error: ValueError: cannot set with list/tuple as value; use a scalar value I am using polars…
Wajahat Raza
  • 1
  • 1
  • 2
0
votes
1 answer

How can I use "\s+" as a separator in polars?

So I have a large file of data that can be up to 11 columns wide, it looks something like this. 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 When I read it in using pandas I used the code: pd.read_csv(file_dir, skiprows = 1, sep = '\s+'). When…
8mrsteel
  • 27
  • 6