I have a dataframe with a column "ID" of type UInt32, and a vector named ids. I want to return a dataframe containing only the rows whose "ID" value appears in ids.

MINIMAL WANTED EXAMPLE

use polars::df;
use polars::prelude::*;

fn filter_by_id(table: &DataFrame, ids: Vec<u32>) -> DataFrame {
    df!{
        "ID" => &[1, 3, 5],
        "VALUE" => &["B", "D", "F"]
    }.unwrap()
}

fn main() {
    let table = df!{
        "ID" => &[0, 1, 2, 3, 4, 5],
        "VALUE" => &["A", "B", "C", "D", "E", "F"]
    }.unwrap();
    let ids = vec![1, 3, 5];
    let filtered_table = filter_by_id(&table, ids);
    println!("{:?}", table);
    println!("{:?}", filtered_table);
}

ID VALUE
0 A
1 B
2 C
3 D
4 E
5 F

filter vector = [1, 3, 5]

wanted output =

ID VALUE
1 B
3 D
5 F
sbb

1 Answer

Polars mostly operates on `Series` and `Expr` types, so converting your vec to a `Series` makes this task relatively easy.


use polars::df;
use polars::prelude::*;

fn main() {
    let table = df!{
        "ID" => &[0, 1, 2, 3, 4, 5],
        "VALUE" => &["A", "B", "C", "D", "E", "F"]
    }.unwrap();
    let ids = vec![1, 3, 5];
    // convert the vec to `Series`
    let ids_series = Series::new("ID", ids);
    // create a filter expression
    let filter_expr = col("ID").is_in(lit(ids_series));
    // filter the dataframe on the expression
    let filtered = table.lazy().filter(filter_expr).collect().unwrap();
    println!("{:?}", filtered);
}

Note: you will need to enable the `lazy` and `is_in` features:

cargo add polars --features lazy,is_in

Cory Grinstead
  • I'm new to Rust and polars. Where can I find a list of these features in the crate's documentation? I'm missing a lot of functions and rewriting them by hand, even though they may already be available as features. – sbb Jan 08 '23 at 18:43
  • The Rust docs have a list of available [opt-in features](https://docs.rs/polars/latest/polars/#compile-times-and-opt-in-features) – Cory Grinstead Jan 08 '23 at 21:16
  • Is there a "not_in" equivalent? – sbb Jan 08 '23 at 23:22
  • There is a `not` method on expressions, which negates any preceding boolean expression, so `.is_in(x).not()` would be the equivalent of `not_in` – Cory Grinstead Jan 09 '23 at 00:03