Questions tagged [python-polars]

Polars is a DataFrame library/in-memory query engine.

The Polars core library is written in Rust and uses Arrow, the native arrow2 Rust implementation, as its foundation. It offers Python and JavaScript bindings, which serve as wrappers for functionality implemented in the core library.

1331 questions
2 votes, 1 answer

How can I connect to SQL Server in Python using Polars or ConnectorX without receiving a runtime error?

I am trying to connect to SQL Server using either polars or connectorx, and I always get a RuntimeError: Timed out in bb8. My code is along these lines: import pandas as pd import polars as pl import connectorx as…
2 votes, 1 answer

polars: list to columns, without `get`

Say I have: In [1]: df = pl.DataFrame({'a': [[1,2], [3,4]]}) In [2]: df Out[2]: shape: (2, 1) ┌───────────┐ │ a │ │ --- │ │ list[i64] │ ╞═══════════╡ │ [1, 2] │ │ [3, 4] │ └───────────┘ I know that all elements of 'a' are lists…
ignoring_gravity • 6,677 • 4 • 32 • 65
2 votes, 2 answers

Compare polars list to python list

Say I have this: df = polars.DataFrame(dict( j=[1,2,3], k=[4,5,6], l=[7,8,9], )) shape: (3, 3) ┌─────┬─────┬─────┐ │ j ┆ k ┆ l │ │ --- ┆ --- ┆ --- │ │ i64 ┆ i64 ┆ i64 │ ╞═════╪═════╪═════╡ │ 1 ┆ 4 ┆ 7 │ │ 2 ┆ 5 ┆ 8 │ │ 3 …
levant pied • 3,886 • 5 • 37 • 56
2 votes, 3 answers

How can I join two dataframes on a key with List[f64] dtype

I have the following two dataframes, and I would like to join them on a key that is a List[pl.Float64] dtype. Is there any way I can achieve the following desired result? df = pl.DataFrame({"id": [1, 2.0], "coords": [(1.0, 2.0), (3.0,…
TomNorway • 2,584 • 1 • 19 • 26
2 votes, 2 answers

How to add multiple DataFrames with different shapes in polars?

I would like to add multiple DataFrames with different shapes together. Before adding the DataFrames, the idea would be to reshape them by adding the missing rows (using an "index" column as the reference) and the missing columns (filled with…
thoera • 23 • 3
2 votes, 2 answers

Recursively lookup value with polars?

I want to be able to get the manager_id of each manager recursively using a polars DF which has two columns: "employee_id" and "manager_1_id". In pandas, this code was: id_index = df.set_index("employee_id")["manager_1_id"] for i in range(1, 12): …
ldacey • 518 • 8 • 16
2 votes, 1 answer

How to find the no. of nulls in every column in a polars dataframe?

In pandas, one can do: import pandas as pd d = {"foo":[1,2,3, None], "bar":[4,None, None, 6]} df_pandas = pd.DataFrame.from_dict(d) dict(df_pandas.isnull().sum()) [out]: {'foo': 1, 'bar': 2} In polars it's possible to do the same by looping…
alvas • 115,346 • 109 • 446 • 738
2 votes, 1 answer

In python polars filter and aggregate dict of lists

I have a dataframe with a string representation of JSON: df = pl.DataFrame({ "json": [ '{"x":[0,1,2,3], "y":[10,20,30,40]}', '{"x":[0,1,2,3], "y":[10,20,30,40]}', '{"x":[0,1,2,3], "y":[10,20,30,40]}' …
2 votes, 1 answer

What is the fastest way to do "indexed" look-ups in Polars?

I am working with large polars dataframes which are fully loaded in memory. Each row is uniquely indexed by columns entityId (Int64) and entryDate (date). I know polars does not have indexes, but I still need to do ad-hoc look-ups of data against…
MYK • 1,988 • 7 • 30
2 votes, 2 answers

Updating non-trivial structures in polars cells

Say I have this: >>> polars.DataFrame([[(1,2),(3,4)],[(5,6),(7,8)]], list('ab')) shape: (2, 2) ┌────────┬────────┐ │ a ┆ b │ │ --- ┆ --- │ │ object ┆ object │ ╞════════╪════════╡ │ (1, 2) ┆ (5, 6) │ │ (3, 4) ┆ (7, 8)…
levant pied • 3,886 • 5 • 37 • 56
2 votes, 3 answers

polars equivalent of pandas groupby.apply(drop_duplicates)

I am new to polars and I wonder what the equivalent of pandas groupby.apply(drop_duplicates) is in polars. Here is the code snippet I need to translate: import pandas as pd GROUP = list('123231232121212321') OPERATION =…
2 votes, 4 answers

How to apply value_counts() to multiple columns in polars python?

I am trying to apply a simple value_counts() to multiple columns on a dataframe in polars, but I am getting an error. import polars as pl import pandas as pd data: sample_df = pl.DataFrame({'sub-category': ['tv','mobile','tv','wm','micro','wm'], …
ViSa • 1,563 • 8 • 30
2 votes, 2 answers

How to replace certain values in a Polars Series?

I want to replace the inf values in a polars series with 0. I am using the polars Python library. This is my example code: import polars as pl example = pl.Series([1,2,float('inf'),4]) This is my desired output: output =…
Sandwichnick • 1,379 • 6 • 13
2 votes, 2 answers

How to create a series for a list of dict in polars?

I'm trying to switch my code from pandas to polars. There's a list of dicts as below: data = [{ "MA5": 91.128, "MA10": 95.559, "MA20": 103.107, "MA30": 109.3803, "MA60": 114.0822 }, { "MA5": 13.776, "MA10": 14.027, …
letit • 23 • 2
2 votes, 1 answer

Read SQL from AWS Athena with Polars

I want to read from AWS Athena with polars. Is this possible? Previously, I used pandas: import pandas as pd pd.read_sql(SQL_STATMENT, conn) I found this User Guide: https://pola-rs.github.io/polars-book/user-guide/howcani/io/read_db.html where Athena…