Questions tagged [python-polars]

Polars is a DataFrame library/in-memory query engine.

The Polars core library is written in Rust and builds on Arrow, using the native arrow2 Rust implementation as its foundation. It offers Python and JavaScript bindings, which serve as wrappers for functionality implemented in the core library.

1331 questions
3
votes
1 answer

Does the Polars module not have a method for appending DataFrames to output files?

When writing a DataFrame to a CSV file, I would like to append to the file instead of overwriting it. While the pandas DataFrame has the .to_csv() method with the mode parameter, which allows appending the DataFrame to a file, none of the…
3
votes
3 answers

How to use the polars cut method and return the result to the original df

How can I use it in select context, such as df.with_columns? To be more specific, if I have a polars dataframe with a lot of columns and one of them is called x, how can I do pl.cut on x and append the grouping result into the original…
lebesgue
  • 837
  • 4
  • 13
3
votes
1 answer

Implement qcut functionality using polars

I have been using polars, but it seems to lack the qcut functionality that pandas has. I am not sure about the reason, but is it possible to achieve the same effect as pandas qcut using currently available polars functionality? The following shows an…
lebesgue
  • 837
  • 4
  • 13
3
votes
3 answers

Creating a date range in python-polars with the last days of the months?

How do I create a date range in Polars (Python API) with only the last days of the months? This is the code I have: pl.date_range(datetime(2022,5,5), datetime(2022,8,10), "1mo", name="dtrange") The result is: '2022-05-05', '2022-06-05',…
Luca
  • 1,216
  • 6
  • 10
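A minimal sketch of one way to get the month-end dates asked about above, using only the Python standard library rather than any Polars API. The helper name `month_ends` is hypothetical and is not part of Polars; the start and end dates are the ones from the question. The resulting list could presumably be wrapped in a Polars Series afterwards, but that step is left out here.

```python
import calendar
from datetime import date


def month_ends(start: date, end: date) -> list[date]:
    """Return the last day of every month that falls within [start, end].

    Hypothetical helper for illustration; not a Polars function.
    """
    out = []
    year, month = start.year, start.month
    # Walk month by month until we pass the month of the end date.
    while (year, month) <= (end.year, end.month):
        # monthrange returns (weekday of first day, number of days in month).
        last_day = calendar.monthrange(year, month)[1]
        candidate = date(year, month, last_day)
        # Keep the month end only if it lies inside the requested range.
        if start <= candidate <= end:
            out.append(candidate)
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return out


# Dates taken from the question: 2022-05-05 through 2022-08-10.
dates = month_ends(date(2022, 5, 5), date(2022, 8, 10))
```

Because the end date falls before the end of August, that month contributes no entry; only month ends that lie inside the range are kept.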
3
votes
2 answers

How to handle timestamps from summer and winter when converting strings in polars

I'm trying to convert string timestamps to polars datetime from the timestamps my camera puts in its RAW file metadata, but polars throws this error when I have timestamps from both summer time and winter time. ComputeError: Different timezones found…
pootle
  • 507
  • 6
  • 15
3
votes
1 answer

How to append data to existing Parquet from Polars

I have multiple polars dataframes and I want to append them to an existing Parquet file. df.write_parquet("path.parquet") overwrites the existing parquet file. How can I append?
Jahspear
  • 151
  • 11
3
votes
3 answers

Polars - concatenate a variable number of columns for each row based off another column

Suppose I have a simple dataframe as manually generated by the code below: cols=['a','b','c'] values=['d','e','f'] df=(pl.DataFrame({cols[i]:[values[i]]*3 for i in range(len(cols))}) .with_columns(pl.lit(pl.Series(['a,b','b,c','a,c'])) …
sjs
  • 53
  • 3
3
votes
1 answer

How to perform pandas reindex in polars

In pandas, I can reindex() the dataframe using multi-index to make the date range consistent for each group. Is there any way to produce the same result in polars? See example below using pandas: import pandas as pd data = pd.DataFrame({ …
3
votes
2 answers

Polars counting elements in list column

I have a dataframe with a column b containing lists, and I need to create a column c that counts the number of elements in the list for every row. Here is a toy example in Pandas: import pandas as pd df = pd.DataFrame({'a': [1,2,3], 'b':[[1,2,3], [2], [5,0]]}) …
Quant Christo
  • 1,275
  • 9
  • 23
3
votes
1 answer

Apply a function to 2 columns in Polars

I want to apply a custom function which takes 2 columns and outputs a value based on those (row-based). In Pandas there is a syntax to apply a function based on values in multiple columns: df['col_3'] = df.apply(lambda x: func(x.col_1, x.col_2),…
Maiia Bocharova
  • 149
  • 1
  • 7
3
votes
1 answer

How to convert Date to timezone aware datetime in polars

Let's say I have df = pl.DataFrame({ "date": pl.Series(["2022-01-01", "2022-01-02"]).str.strptime(pl.Date, "%Y-%m-%d") }) How do I localize that to a specific timezone and make it a datetime? I…
Dean MacGregor
  • 11,847
  • 9
  • 34
  • 72
3
votes
3 answers

(Polars) How to get element from a column with list by index specified in another column

I have a dataframe with 2 columns, where the first column contains lists and the second column contains integer indexes. How do I get elements from the first column by the index specified in the second column? Or even better, put that element in a 3rd column. So for example, how…
Kaster
  • 357
  • 4
  • 16
3
votes
0 answers

Polars join on array items without explode/groupby

A follow-up to Polars lazyframe - add fields from other lazyframe as struct without a `collect`. I now want to join on array items. Currently the only way I know of doing this would be to first explode the array, perform the join, do a groupby,…
Cory Grinstead
  • 511
  • 3
  • 16
3
votes
1 answer

Polars lazyframe - add fields from other lazyframe as struct without a `collect`

I am trying to populate a new field containing a struct of all of the other fields from another lazyframe based on a predicate. While the examples are in python, I am open to answers in python or rust. companies = pl.DataFrame({ "id": [1], …
Cory Grinstead
  • 511
  • 3
  • 16
3
votes
1 answer

How can I add a column of empty arrays to polars.DataFrame?

I am trying to add a column of empty lists to a polars dataframe in python. My code import polars as pl a = pl.DataFrame({'a': [1, 2, 3]}) a.with_columns([pl.lit([]).alias('b')]) throws Traceback (most recent call last): File "", line 1,…
Dimitrius
  • 564
  • 6
  • 21