Questions tagged [python-polars]

Polars is a DataFrame library/in-memory query engine.

The Polars core library is written in Rust and uses Arrow, the native arrow2 Rust implementation, as its foundation. It offers Python and JavaScript bindings, which wrap the functionality implemented in the core library.

1331 questions
0 votes
1 answer

How to split a big DataFrame into a Vec by group in Polars

I stored some financial market data in a Polars DataFrame. For analysis, it is fast to run a groupby("date").agg() operation. But in a real-time scenario, new data arrives continuously, and I don't want to concat the new data with the old data…
Hakase
  • 211
  • 1
  • 12
0 votes
1 answer

How to calculate pct_change with Polars?

Now I have a dataframe like this: df = pd.DataFrame({"asset":["a","b","c","a","b","c","b","c"],"v":[1,2,3,4,5,6,7,8],"date":["2017","2011","2012","2013","2014","2015","2016","2010"]}) I can calculate the pct_change by groupby and my function like…
0 votes
1 answer

polars: multi-threaded computing of common elements between two columns

I have a huge dataset (~100M rows). As pandas does not support multi-threading, I am trying to use the Polars library for analysis. The minimal problem I am trying to solve is shown below: >>> import polars as pl >>> df = pl.DataFrame({"col1" : ["abc",…
xinit
  • 147
  • 9
0 votes
1 answer

Groupby count values as columns

I want to group by some columns, but one of the columns is either 'Fire' or 'Water'. I need to count the occurrences of 'Fire' and 'Water' in separate columns and also have a 'total' column which counts the sum of 'Fire' and 'Water'. Example: df =…
supersick
  • 261
  • 2
  • 14
0 votes
1 answer

Polars: how to add years as literals?

I have a Polars LazyFrame that, after applying several functions, looks like this: ┌───────────────┬──────────────┬─────────────────────────┬──────────────────────────┐ │ citing_patent ┆ cited_patent ┆ cited_patent_issue_date ┆…
0 votes
1 answer

Overwriting inf in many columns

I have a dataframe with many columns that contain occurrences of inf. I'd like to replace these with null. All of the column names in question start with the string "ratio_". This is what I've tried, but I get new columns with the title "literal", when…
TomNorway
  • 2,584
  • 1
  • 19
  • 26
0 votes
1 answer

How to create a Polars DataFrame based on the previous row

I'm trying to create a Polars DataFrame in Python. The DataFrame has two columns, timestamp(secs) and Counter, and I'm given only the first row: 164323232, 2. Now I need to create a dummy DataFrame (say 100 rows) on the basis of this first row. Each row…
0 votes
1 answer

Advice on refactoring a Polars expr

I have a Polars expr, and I cannot use a context because my function has to return a Polars expr. I've implemented an RSI indicator in Polars: rsi_indicator = (100*pl.when(pl.col("close").pct_change() >= 0) \ …
Sigi
  • 53
  • 8
0 votes
1 answer

Lazily selecting rows in Polars?

I've written a lazy data-processing function with polars to process a large parquet dataset. Is there a way I can select N rows from the parquet file and return a lazy dataset? I notice that both .fetch(N) and .head(N) return DataFrames, not…
TomNorway
  • 2,584
  • 1
  • 19
  • 26
0 votes
1 answer

Casting a column from hexadecimal string to uint64?

As part of a Kaggle competition (https://www.kaggle.com/competitions/amex-default-prediction/overview), I'm trying to take advantage of a trick where other competitors sharing their solutions reduce the size of a column by interpreting a…
TomNorway
  • 2,584
  • 1
  • 19
  • 26
0 votes
1 answer

Upsampling a polars dataframe with groupby

I'm trying to upsample a Polars dataframe while grouping by a particular column. In the following example, I wish to group by 'fruit' and then upsample by…
NFern
  • 1,706
  • 17
  • 18
0 votes
1 answer

How to apply conditional logic to a column sorted by another column within each group

Given the data frame const df = new DataFrame({ group: ["a", "a", "a", "a", "a", "a", "b", "b", "b", "b", "b", "b"], ts: [ new Date("2022-06-08T00:00:01"), new Date("2022-06-08T00:00:06"), new Date("2022-06-08T00:00:11"), new…
0 votes
1 answer

Apply to a list of columns in Polars

In the following dataframe I would like to multiply var_3 and var_4 by negative 1. I can do so using the following method but I am wondering if it can be done by collecting them in a list (imagining that there may be many more than 4 columns in the…
mark0512
  • 37
  • 3
0 votes
1 answer

Is there a way to handle an `empty csv` in Python when using `read_csv` from Polars?

My question is much like this one, but I'm using Polars. Environment: Python 3.8, polars >=0.13.24. I have a CSV file to parse every 500ms, but it may be reset by another program. When it is reset via reopening it, Polars will throw…
Campbell He
  • 297
  • 2
  • 9
0 votes
1 answer

Iterate through groupby like pandas with a tuple

When I iterate through a pandas.groupby(), what I get back is a tuple. This was important because I could do [x for x in df_pandas.sort('date').groupby('grouping_column')] and then sort this list of tuples based on x[0]. In pandas it's also…
supersick
  • 261
  • 2
  • 14