I am struggling to access the elements of a row in a Polars DataFrame.

One idea I have is to filter the DataFrame down to a single row, convert it to a Vec or something similar, and access the elements that way.

In pandas I used to just use ".at / .loc / .iloc / etc."; with Polars in Rust I have no clue.

Any suggestions on the proper way to do this?

Robert
  • Have you tried the `get_row` method on `DataFrame`? – isaactfa Aug 16 '22 at 11:53
  • good idea ... but I don't have an index ... the row should be selected based on a specific value in a column? ... and then I do not know how to access the elements of the row-result ... – Robert Aug 16 '22 at 15:32
  • Then [`filter`](https://docs.rs/polars-lazy/0.23.2/polars_lazy/frame/struct.LazyFrame.html#method.filter) it down and then get the first row or whatever out. That's gonna be an iterator over [`AnyValue`s](https://docs.rs/polars-core/0.23.2/polars_core/datatypes/enum.AnyValue.html#) that you can use. – isaactfa Aug 16 '22 at 16:52

1 Answer

Thanks to @isaactfa, who got me onto the right track. I ended up getting the row not with "get_row" but with "get"; this is probably down to my limited understanding of Rust (this is my second week).

Here is a working code sample:

use polars::export::arrow::temporal_conversions::date32_to_date;

use polars::prelude::*;

fn main() -> Result<()> {
    let days = df!(
        "date_string" => &["1900-01-01", "1900-01-02", "1900-01-03", "1900-01-04", "1900-01-05",
        "1900-01-06", "1900-01-07", "1900-01-09", "1900-01-10"])?;

    let options = StrpTimeOptions {
        date_dtype: DataType::Date,   // the result column-datatype
        fmt: Some("%Y-%m-%d".into()), // the source format of the date-string
        strict: false,
        exact: true,
    };

    // convert date_string into dtype(date) and put into new column "date_type"
    // we convert the days DataFrame to a LazyFrame ...
    // because in my real-world example I am getting a LazyFrame
    let mut new_days_lf = days.lazy().with_column(
        col("date_string")
            .alias("date_type")
            .str()
            .strptime(options),
    );

    // Getting the weekday as a number:
    // This is what I wanted to do ... but I get a string result .. need u32
    // let o = GetOutput::from_type(DataType::Date);
    // new_days_lf = new_days_lf.with_column(
    //     col("date_type")
    //         .alias("weekday_number")
    //         .map(|x| Ok(x.strftime("%w").unwrap()), o.clone()),
    // );

    // This is the convoluted workaround for getting the weekday as a number.
    // The closure returns a UInt32 series, so the declared output type is UInt32.
    let o = GetOutput::from_type(DataType::UInt32);
    new_days_lf = new_days_lf.with_column(col("date_type").alias("weekday_number").map(
        |x| {
            Ok(x.date()
                .unwrap()
                .clone()
                .into_iter()
                .map(|opt_name: Option<i32>| {
                    opt_name.map(|datum: i32| {
                        // println!("{:?}", datum);
                        date32_to_date(datum)
                            .format("%w")
                            .to_string()
                            .parse::<u32>()
                            .unwrap()
                    })
                })
                .collect::<UInt32Chunked>()
                .into_series())
        },
        o,
    ));

    new_days_lf = new_days_lf.with_column(
        col("weekday_number")
            .shift_and_fill(-1, 9999)
            .alias("next_weekday_number"),
    );

    // now we convert the LazyFrame into a normal DataFrame for further processing:
    let mut new_days_df = new_days_lf.collect()?;

    // get the "weekday_number" column as a Series;
    // getting a column by name requires an eager DataFrame, hence the collect() above
    let col1 = new_days_df.column("weekday_number")?;

    // get the "next_weekday_number" column as a Series
    let col2 = new_days_df.column("next_weekday_number")?;

    // now I can use series-arithmetics
    let diff = col2 - col1;

    // create a bool column based on "element == 2"
    // add bool column to DataFrame
    new_days_df.replace_or_add("weekday diff eq(2)", diff.equal(2)?.into_series());

    // could not figure out how to filter the eager frame ...
    let result = new_days_df
        .lazy()
        .filter(col("weekday diff eq(2)").eq(true))
        .collect()
        .unwrap();

    // could not figure out how to access the elements of a row with "get_row",
    // thus I used "get" instead
    // get the first row where the diff is == 2 (true)
    let filtered_row = result.get(0).unwrap();

    // within the filtered_row get element with an index
    let date = filtered_row.get(0).unwrap();

    println!("\n{:?}", date);

    Ok(())
}
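As a side note on the weekday workaround above (format the date with "%w", then parse the string back into a number): the value stored in a Date column is the number of days since 1970-01-01, so the weekday can also be computed with plain integer arithmetic. The following is only a sketch under that assumption; `weekday_from_date32` is a hypothetical helper name, not a Polars or Arrow function, and it uses nothing but the standard library.

```rust
/// Weekday number in the `%w` convention (Sunday = 0 ... Saturday = 6)
/// for a date given as days since 1970-01-01.
/// Hypothetical helper for illustration; not part of Polars.
fn weekday_from_date32(days: i32) -> u32 {
    // 1970-01-01 was a Thursday, i.e. 4 in the `%w` convention.
    // rem_euclid keeps the result in 0..7 for dates before the epoch too.
    (i64::from(days) + 4).rem_euclid(7) as u32
}

fn main() {
    // -25567 is 1900-01-01 expressed as days since the Unix epoch.
    for days in [-25567, -25566, 0] {
        println!("{} -> {}", days, weekday_from_date32(days));
    }
}
```

Inside the `map` closure, a helper like this could presumably stand in for the `date32_to_date(datum).format("%w").to_string().parse::<u32>()` chain, which avoids the round trip through a string.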
Robert