Questions tagged [python-polars]

Polars is a DataFrame library/in-memory query engine.

The Polars core library is written in Rust and uses Arrow, the native arrow2 Rust implementation, as its foundation. It offers Python and JavaScript bindings, which serve as a wrapper for functionality implemented in the core library.

1331 questions
0
votes
3 answers

Python-Polars: How to filter categorical column with string list

I have a Polars dataframe like below: df_cat = pl.DataFrame( [ pl.Series("a_cat", ["c", "a", "b", "c", "b"], dtype=pl.Categorical), pl.Series("b_cat", ["F", "G", "E", "G", "G"], dtype=pl.Categorical) ]) print(df_cat) shape: (5,…
Peascod
  • 11
  • 2
0
votes
2 answers

Is there a way to cumulatively and distinctively expand a list in polars

For instance, I want to accomplish a conversion like below. ┌────────────┐ │ col │ │ --- │ │ list[str] │ ╞════════════╡ │ ["a"] │ ├╌╌╌╌╌╌╌╌╌╌╌╌┤ │ ["a", "b"] │ ├╌╌╌╌╌╌╌╌╌╌╌╌┤ │ ["c"] …
0
votes
1 answer

How to update the data frame column values from another data frame based on a conditional match in polars?

I have two data frames as below. df_names = pl.DataFrame({'last_name':['Williams','Henry','XYX','Smith','David','Freeman','Walter','Test_A'], …
myamulla_ciencia
  • 1,282
  • 1
  • 8
  • 30
0
votes
1 answer

How can I do an outer join on two columns

So I have two Polars DataFrames, both containing the columns AuctionId and RealmName. I want to find all the rows that do not exist in the other one, based on the AuctionId and RealmName combination. Hopefully the below answer can help :) import polars as…
Shamatix
  • 77
  • 1
  • 6
0
votes
3 answers

How to roll up duplicate observations in Python polars?

I have a data frame as- my_dt = pl.DataFrame({'last_name':['mallesh','bhavik','jagarini','mallesh','jagarini'], 'first_name':['yamulla','vemulla','yegurla','yamulla','yegurla'], …
myamulla_ciencia
  • 1,282
  • 1
  • 8
  • 30
0
votes
1 answer

Polars - Search results between two dataframes

So I am using a Polars DF and got a bit stuck on a task I am trying to do. Basically, I have a df, let's call it auctiondf, which can contain a lot of "auctions". I also have a trackingdf which contains items/auctions I want to track with…
Shamatix
  • 77
  • 1
  • 6
0
votes
0 answers

How to read a larger DB from MySQL using Polars and ConnectorX?

I'm new to Polars. I'm trying to get data from MySQL using Polars for one of my projects. Here I have made a connection and am trying to retrieve some sample data from a table. On executing this query, it returns some index data, not the actual data…
myamulla_ciencia
  • 1,282
  • 1
  • 8
  • 30
0
votes
2 answers

How to apply frozenset on polars dataframe?

I have a pandas dataframe as: df_names = pd.DataFrame({'last_name':['Williams','Henry','XYX','Smith','David','Freeman','Walter','Test_A'], …
myamulla_ciencia
  • 1,282
  • 1
  • 8
  • 30
0
votes
1 answer

Polars - Perform matrix inner product on lazy frames to produce sparse representation of gram matrix

Suppose we have a polars dataframe like: df = pl.DataFrame({"a": [1, 2, 3], "b": [3, 4, 5]}).lazy() shape: (3, 2) ┌─────┬─────┐ │ a ┆ b │ │ --- ┆ --- │ │ i64 ┆ i64 │ ╞═════╪═════╡ │ 1 ┆ 3 │ ├╌╌╌╌╌┼╌╌╌╌╌┤ │ 2 ┆ 4 │ ├╌╌╌╌╌┼╌╌╌╌╌┤ │ 3 ┆…
OneRaynyDay
  • 3,658
  • 2
  • 23
  • 56
0
votes
1 answer

Is there a setting to automatically cast inf to nan in polars?

We know that in pandas we can set pd.set_option('use_inf_as_na', True) to make sure inf is automatically cast to nan. Is there a similar option in polars?
Leo Liu
  • 510
  • 4
  • 8
0
votes
0 answers

Polars - PanicException: Capacity overflow - help needed

What language are you using? Python Have you tried latest version of polars? Yes. What version of polars are you using? Polars 0.14.1 What operating system are you using polars on? Microsoft Windows 10 Pro, Version 10.0.19043 Build 19043 What…
bruppfab
  • 3
  • 2
0
votes
1 answer

Aggregate points into a grid using Polars

I have a points dataset in the following format (x, y, value), is it possible to get aggregated dataset using Polars native (maybe even lazy) code as much as possible? Basically I want to create a virtual grid and then sum all the points in…
Nezbeda
  • 131
  • 2
  • 12
0
votes
1 answer

How to get index corresponding to quantile in Polars List?

Suppose I have the following dataframe df = pl.DataFrame({'x':[[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20]]}) To get the nth percentile, I can do the following: list_quantile_30 =…
Scout
  • 27
  • 5
0
votes
2 answers

Converting a polars Date column to only Months

I am trying to convert an existing column in my polars dataframe from Date to Month. The documentation here is not clear to me on how to call such methods. In pandas it looks like this to convert Date ->…
nosewitz
  • 1
  • 1
0
votes
2 answers

How to keep the original datetime in a pyarrow table?

The original datetime in a dict array is data= [ { eob:datetime.datetime(2022, 8, 5, 9, 35, tzinfo=tzfile('PRC')) }, { eob:datetime.datetime(2022, 8, 5, 9, 40, tzinfo=tzfile('PRC')) } ] table =…
colinshen
  • 1
  • 1