Questions tagged [rust-polars]

271 questions
2
votes
0 answers

Is there a way to use a specific global string cache after running reset_string_cache() in rust polars

Running the code below, we get two global string caches, because I called reset_string_cache() before creating out2. Is it possible to create another categorical variable with the same global string cache that was used in out1 after calling reset_string_cache()? use…
Kushdesh
  • 1,118
  • 10
  • 16
2
votes
1 answer

LazyFrame: How to do string manipulation on values in a single column

I want to change all string values in a LazyFrame-Column. e.g. from "alles ok" ==> to "ALLES OK" I see that a series has a function to do it: polars.internals.series.StringNameSpace.to_uppercase Q: What is the proper way to apply a string (or Date)…
Robert
  • 131
  • 1
  • 7
2
votes
1 answer

POLARS Dataframe innerJOIN in RUST

Rust / Polars newbie question :) I cannot get "inner_join" to work: use polars::prelude::*; use std::fs::File; use std::path::PathBuf; use std::env; fn main() -> std::io::Result<()> { let mut root = env::current_dir().unwrap(); let…
Robert
  • 131
  • 1
  • 7
2
votes
1 answer

How to create a column with the lengths of strings from a different column in Polars Rust?

I'm trying to replicate one of the Polars Python examples in Rust but seem to have hit a wall. In the Python docs there is an example which creates a new column with the lengths of the strings from another column. So for example, column B will…
Stefan
  • 61
  • 1
  • 1
  • 9
2
votes
0 answers

What is an efficient way to loop through rows of a polars LazyFrame?

I have a large Parquet file with just a few columns, and I'd like to apply a function to each row of the Parquet file. What is an efficient way of doing that? By efficient, I mean 2 things. The code runs fast. The code doesn't load the whole file into…
Benjamin Du
  • 1,391
  • 1
  • 17
  • 25
2
votes
3 answers

How to apply a function to multiple columns of a polars DataFrame in Rust

I'd like to apply a user-defined function which takes a few inputs (corresponding to some columns in a polars DataFrame) to some columns of a polars DataFrame in Rust. The pattern that I'm using is as below. I wonder whether this is the best practice. fn…
Benjamin Du
  • 1,391
  • 1
  • 17
  • 25
2
votes
2 answers

window agg over one value, but return another via Polars

I am trying to use polars to do a window aggregate over one value, but map it back to another. For example, if I wanted to get the name of the max value in a group, instead of (or in combination with) just the max value. Assuming an input of something…
Cory Grinstead
  • 511
  • 3
  • 16
2
votes
1 answer

How to loc value in Rust Polars?

In Pandas or Polars-Python, we can look up a value by using iloc, loc, or [1,2]. How can we do the same thing in Polars with Rust?
Hakase
  • 211
  • 1
  • 12
2
votes
1 answer

How to get a Vec from polars Series or ChunkedArray?

In Rust Polars, how do I convert a Series or ChunkedArray to a Vec?
Hakase
  • 211
  • 1
  • 12
2
votes
1 answer

How to get the first n% of a group in polars?

Q1: In polars-rust, when you do .groupby().agg(), you can use .head(10) to get the first 10 elements in a column. But if the groups have different lengths and I need to get the first 20% of elements in each group (like 0-24 elements in a 120 elements…
Hakase
  • 211
  • 1
  • 12
2
votes
2 answers

Is there a way to apply a UDF function returning multiple values in Rust polars?

I'm trying to use polars to apply a function from another library across each row of an input. I can't find any examples or tests of using an Expr to apply a function, even when it has one return value; so I'm lost. It's taking an input dataframe…
2
votes
1 answer

In polars, can I create a categorical type with levels myself?

In Pandas, I can specify the levels of a Categorical type myself: MyCat = pd.CategoricalDtype(categories=['A','B','C'], ordered=True) my_data = pd.Series(['A','A','B'], dtype=MyCat) This means that I can make sure that different columns and sets…
Jarrad
  • 927
  • 8
  • 19
2
votes
2 answers

How to define types of columns while loading dataframe in polars?

I'm using polars and I would like to define the type of the columns while loading a dataframe. In pandas, I can use dtype: df=pd.read_csv("iris.csv", dtype={'petal_length':str}) I'm trying to do the same thing in polars, but without success until…
Lucas
  • 1,166
  • 2
  • 14
  • 34
1
vote
1 answer

Polars: Date formatting with 'custom' nanosecond precision

I am new to polars, and I was trying to convert a flat file datetime type column to a dataframe column with 1 nanosecond precision. For example: 1974-08-08 16:28:51 => 1974-08-08 16:28:51.0 1997-12-12 17:56:19 => 1997-12-12 17:56:19.0 Flat…
1
vote
0 answers

Polars - avoid repeatedly doing a groupby_dynamic for similar operations? How to rewrite the Exprs?

I have a large dataframe that I am doing a groupby_dynamic operation on, and then doing an aggregation that combines the columns in each group down to a single boolean value. Ultimately, I have to do this operation many times, as it is part of a…
Andrew P.
  • 31
  • 4