
I am trying to filter a LazyFrame using a chrono::NaiveDateTime range. Here is what I have so far:

use polars::prelude::*;
use polars_lazy::prelude::*;

pub fn keep_range_lazy(
    df: &mut DataFrame,
    start: &NaiveDateTime,
    end: &NaiveDateTime,
) -> Result<(), PolarsError> {
    assert!(
        df.get_column_names().contains(&"timestamp"),
        "Dataframe does not contain timestamp column."
    );

    df = &mut df
        .lazy()
        .filter(
            col("timestamp")
                .dt()
                .datetime()
                .gt_eq(start)
                .and(col("timestamp").dt().datetime().lt(end)),
        )
        .collect()?;

    Ok(())
}

The above code fails because start and end cannot be converted into an Expr:

error[E0277]: the trait bound `polars_lazy::dsl::Expr: std::convert::From<&chrono::NaiveDateTime>` is not satisfied
   --> src/utils.rs:32:24
    |
32  |                 .gt_eq(start)
    |                  ----- ^^^^^ the trait `std::convert::From<&chrono::NaiveDateTime>` is not implemented for `polars_lazy::dsl::Expr`
    |                  |
    |                  required by a bound introduced by this call
    |
    = help: the following other types implement trait `std::convert::From<T>`:
              <polars_lazy::dsl::Expr as std::convert::From<&str>>
              <polars_lazy::dsl::Expr as std::convert::From<bool>>
              <polars_lazy::dsl::Expr as std::convert::From<f32>>
              <polars_lazy::dsl::Expr as std::convert::From<f64>>
              <polars_lazy::dsl::Expr as std::convert::From<i32>>
              <polars_lazy::dsl::Expr as std::convert::From<i64>>
              <polars_lazy::dsl::Expr as std::convert::From<polars_lazy::dsl::AggExpr>>
              <polars_lazy::dsl::Expr as std::convert::From<u32>>
              <polars_lazy::dsl::Expr as std::convert::From<u64>>
    = note: required for `&chrono::NaiveDateTime` to implement `std::convert::Into<polars_lazy::dsl::Expr>`
note: required by a bound in `polars_plan::dsl::<impl polars_lazy::dsl::Expr>::gt_eq`
   --> /home/username/.cargo/registry/src/github.com-1ecc6299db9ec823/polars-plan-0.28.0/src/dsl/mod.rs:258:21
    |
258 |     pub fn gt_eq<E: Into<Expr>>(self, other: E) -> Expr {
    |                     ^^^^^^^^^^ required by this bound in `polars_plan::dsl::<impl Expr>::gt_eq`

Notes:

  • I have seen this answer, which does not suit me because it implies using .hours().minutes().seconds(), whereas there should be a way to simply use a single DateTime variable.
  • I have seen this other answer, which does not suit me either because it uses a DataFrame instead of a LazyFrame.
  • The solution does not necessarily have to work in place. The final signature of the function could very well be pub fn get_range_lazy(df: DataFrame, start: &NaiveDateTime, end: &NaiveDateTime) -> Result<DataFrame, PolarsError> if that does not imply a performance loss.

Here is the doc of the polars DSL.

Pierre

1 Answer


I love the compiler errors in Rust: they are so instructive about the problem, and often even point to a solution. In this case, the error tells you clearly that the trait bound is not satisfied, but also that From is implemented for several integer types. To use a NaiveDateTime, we only need to convert the values to an appropriate integer first. For example:

// `df` must be a LazyFrame here (for a DataFrame, call `.lazy()` first).
df.filter(
    col("timestamp")
        .gt_eq(start.timestamp_millis())
        .and(
            col("timestamp").lt(end.timestamp_millis()),
        ),
)

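I used timestamp_millis() here because my "timestamp" column is stored in milliseconds. Your column may use a different time unit, in which case you should use the matching method (for example timestamp_micros()).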
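To show why the unit matters, here is a minimal sketch in plain Rust with no polars involved. It assumes the column is nothing more than a list of i64 values in milliseconds since the Unix epoch. The helper keep_range and the sample values are hypothetical and only mimic the half-open `gt_eq` / `lt` filter above.

```rust
/// Hypothetical helper: keeps values in the half-open range [start, end),
/// the same condition as `gt_eq(start).and(lt(end))` in the filter above.
fn keep_range(timestamps: &[i64], start: i64, end: i64) -> Vec<i64> {
    timestamps
        .iter()
        .copied()
        .filter(|&t| t >= start && t < end)
        .collect()
}

fn main() {
    // Assumed column contents: milliseconds since the Unix epoch (UTC).
    let column_ms: Vec<i64> = vec![
        1_640_995_200_000, // 2022-01-01 00:00:00
        1_656_633_600_000, // 2022-07-01 00:00:00
        1_672_531_200_000, // 2023-01-01 00:00:00
    ];

    // Bounds in the same unit as the column (milliseconds).
    let start_ms: i64 = 1_640_995_200_000; // 2022-01-01 00:00:00
    let end_ms: i64 = 1_672_531_200_000; // 2023-01-01 00:00:00
    let kept = keep_range(&column_ms, start_ms, end_ms);
    println!("matching unit: {} rows kept", kept.len());

    // The same bounds expressed in microseconds: every value in the
    // millisecond column is now far below `start`, so nothing matches.
    let kept_wrong = keep_range(&column_ms, start_ms * 1_000, end_ms * 1_000);
    println!("mismatched unit: {} rows kept", kept_wrong.len());
}
```

If the bounds and the column disagree on the unit, the filter still compiles and runs but silently keeps the wrong rows, so it is worth checking the column's time unit before choosing the conversion method.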

Matt
  • Thank you for your answer, it indeed works, but I don't understand how it knows how to compare a `NaiveDateTime` with an `i64` in your code and not in the following code: `let d1 = NaiveDateTime::parse_from_str("01/01/2022 21:16:00.00", "%m/%d/%Y %H:%M:%S%.3f").unwrap();` `let d2 = NaiveDateTime::parse_from_str("12/30/2022 21:16:00.00", "%m/%d/%Y %H:%M:%S%.3f").unwrap();` `date < d1.timestamp_micros()` Could you please explain what is the difference with your example above ? – Pierre Apr 19 '23 at 09:14
  • Comparison operators like `<` are not an `Expr`, you need to use something like `col("foo").lt(lit(date))`, or whatever, keeping in mind that under-the-hood `DataType::Datetime` is an integer. See https://docs.rs/polars/latest/polars/prelude/enum.Expr.html – Matt Apr 21 '23 at 23:14