Questions tagged [rust-polars]

271 questions
1
vote
1 answer

With polars dataframe, is it possible to serialize/deserialize only the transformations and not the original dataframe?

I'd like to be able to serialize the transformations, such as a group by, and then apply them to an existing dataframe. I'd rather not have to build the LogicalPlan recursively, as this would mean having to manually implement and keep up to date…
julienfr112
1
vote
0 answers

How to create a dataframe from various expressions in polars

I am working with polars and would like to create a DataFrame from the outputs of various expressions, such as mean, median, and std. let series_a = Series::new("ID", vec![1, 2, 3, 4]); let series_b = Series::new("Amount", vec![10.0, 22.0,…
Kival M
1
vote
1 answer

Rust polars: Create DataFrame and groupby and Aggregate on it inside apply_multiple() (ie inside another groupby context)?

I have complex computation logic that requires me to create a new DataFrame from the inputs to apply_multiple, so that I can use DataFrame functionality such as filter, groupby, and aggregate on that new DataFrame (inside a wider groupby aggregate…
Anatoly Bugakov
1
vote
2 answers

How to add column names to a Polars DataFrame while using CsvReader

I can read a CSV file that has no column headers with the following code, using polars in Rust: use polars::prelude::*; fn read_wine_data() -> Result { let file = "datastore/wine.data"; …
DataPsycho
1
vote
2 answers

Cannot read CSV into Polars DataFrame in Rust with LazyCsvReader

I was trying the Rust version of polars for the first time. I set up a project and added polars to the Cargo.toml file, which looks as follows: [package] name = "polar_test" version = "0.1.0" edition = "2021" # See more keys and their…
DataPsycho
1
vote
0 answers

Load dataframe from json given headers and rowSet

I am trying to use the polars Rust library to create dataframes from JSON fetched from stats.nba.com (example JSON). The best example I could find for creating a dataframe from JSON was from the docs, but I'm not sure how to load a serde_json::Value…
user1775500
1
vote
1 answer

Rust Polars .par_iter() for ChunkedArray?

In Rust, using Polars, I am writing a custom function to be used within apply/map. This works well: fn capita(x: Series) -> Result { let y = x .utf8() .unwrap() .par_iter() //ParallelIterator However,…
Anatoly Bugakov
1
vote
0 answers

Can anyone provide an example of polars as_struct().apply()?

I am trying to add an array column to an existing DataFrame. The input is something like this | 1 | 3 | | 2 | 4 | and the output is | 1 | 3 | [?, ?, ?, ?] | | 2 | 4 | [?, ?, ?, ?] | The values of the array will be populated by some custom…
1
vote
1 answer

Is it possible to load Parquet data directly from memory?

I have a use case where I will be downloading Parquet data directly into memory (not into the filesystem). Is it possible to load these files as (lazy) dataframes from a Vec instead of passing in the path?
user655321
1
vote
0 answers

Why is the Polars groupby.agg in Rust slower than the Python version?

In Polars-python, I can do this lazy operation; it takes about 17 ms, and the eager version takes almost the same time. The data has 100000 rows. Data sample: code date open close change_predict factor factor_cta A …
Hakase
1
vote
0 answers

How to use filter in polars

In polars-py, I can do this: data.groupby('date').agg([ pl.col("name").sort_by('factor').head(5).filter(pl.col("x")==1).alias('small'), pl.col("name").sort_by('factor').tail(5).filter(pl.col("x")==0).alias('big'), ]).sort('date') But…
Hakase
1
vote
2 answers

How to get row_count for a group in polars?

The usage might look like the code below: out_df = df.select([ pl.col("*"), pl.col("md5").row_count().over("md5").alias("row_count"), ]) print(out_df) The data should look like this: before: md5 a a b after: md5 row_count a 1 a 2 b 1
侯颖堃
1
vote
1 answer

How to read compressed TSV files (*.gtf.gz) with rust-polars?

Complete Rust beginner here, coming from Python. I would like to use rust-polars to read a compressed GTF (*.gtf.gz) file: let schema = Arc::new(Schema::new(vec![ Field::new("contigName", DataType::Categorical), …
Hoeze
1
vote
1 answer

`AsArray` cannot be made into an object when implementing a trait for a trait

Basically I'm trying to make a trait that indicates the ability to be converted into a 2D ndarray aka ndarray::Array2: trait Into2DArray{ fn to_array(&self) -> Array2; } I would like to do this by expanding the existing AsArray trait, but…
Migwell
1
vote
1 answer

How do I use `ndarray_stats::CorrelationExt` on a `polars::prelude::DataFrame`?

I'm trying to calculate the covariance of a data frame in Rust. The ndarray_stats crate defines such a function for arrays, and I can produce an array from a DataFrame using to_ndarray. The compiler is happy if I use the example in the documentation…
Migwell