I'm working on a small utility app. I'm coming from Python/pandas and trying to rebuild some basic tools in Rust with polars so they can be distributed as executables. I'm having a hard time interpreting the documentation for what seems like it should be a fairly simple process: reading some raw data, resampling it based on the datetime column, and interpolating it to fill in missing data as necessary.
My Cargo.toml looks like:
[dependencies]
polars = "0.19.0"
And the code I've written so far is:
use polars::prelude::*;
use std::fs::File;
fn main() {
    let mut df = CsvReader::new("raw.csv".into())
        .finish();
    //interpolate to clean up blank/nan
    //resample/groupby 15Min-1D using mean, blank/nan if missing
    let mut file = File::create("final.csv").expect("File not written!!!");
    CsvWriter::new(&mut file)
        .has_header(true)
        .with_delimiter(b',')
        .finish(&df);
}
and the raw.csv data might look like:
site,datetime,val1,val2,val3,val4,val5,val6
XX1,2021-01-01 00:45,,,,4.60,,
XX1,2021-01-01 00:50,,,,2.30,,
XX1,2021-01-01 00:53,21.90,16.00,77.67,3.45,1027.20,0.00
XX1,2021-01-01 01:20,,,,4.60,,
XX1,2021-01-01 01:53,21.90,16.00,77.67,3.45,1026.90,0.00
XX1,2021-01-01 01:55,,,,0.00,,
XX1,2021-01-01 02:00,,,,0.00,,
XX1,2021-01-01 02:45,,,,5.75,,
XX1,2021-01-01 02:50,,,,8.05,,
XX1,2021-01-01 02:53,21.00,16.00,80.69,8.05,1026.80,0.00
But I can't seem to call the methods because I get errors like:
method not found in `Result<DataFrame, PolarsError>`
or
expected struct `DataFrame`, found enum `Result`
and I'm not sure how to properly convert between these types, i.e. how to get from the `Result` to the `DataFrame` inside it.
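My current guess is that both errors come from the same place: the call returns a `Result`, not the value itself, so it has to be unwrapped (with `?`, `expect`, or a `match`) before any method of the inner type can be called. Here is a std-only sketch of that pattern as I understand it. `load_values` and `mean` are hypothetical stand-ins I made up for illustration, not polars functions:

```rust
use std::num::ParseFloatError;

// Hypothetical stand-in for a reader whose final call returns a Result.
fn load_values(raw: &str) -> Result<Vec<f64>, ParseFloatError> {
    raw.split(',').map(|s| s.trim().parse::<f64>()).collect()
}

// Methods of the inner value are only reachable after unwrapping the Result.
fn mean(raw: &str) -> Result<f64, ParseFloatError> {
    let values = load_values(raw)?; // `?` yields the Vec or returns the error early
    Ok(values.iter().sum::<f64>() / values.len() as f64)
}

fn main() {
    match mean("4.60, 2.30, 3.45") {
        Ok(m) => println!("mean = {m:.2}"),
        Err(e) => eprintln!("could not parse input: {e}"),
    }
}
```

If that is right, I assume the same idea applies to the `Result<DataFrame, PolarsError>` in my code, but I don't know what the idiomatic polars version looks like.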
I've tried obviously wrong guesses like:
let grouped = df.lazy().groupby_dynamic("datetime", "1h").agg("datetime", mean());
but basically, I'm looking for the polars equivalent of this pandas code:
df = df.interpolate()
df = df.resample(sample_frequency).mean()
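To make the intended behaviour concrete, here is a plain-Rust sketch (no polars) of roughly what I want those two steps to do to a single column. It rests on several assumptions of mine: timestamps are given as minutes since midnight, gaps are filled linearly by row position rather than by actual time, gaps at the edges with no neighbour on one side stay empty, and only buckets that contain at least one row are emitted. The names `interpolate` and `resample_mean` are made up for this example:

```rust
use std::collections::BTreeMap;

// Fill interior gaps linearly by row position; edge gaps stay None.
fn interpolate(values: &[Option<f64>]) -> Vec<Option<f64>> {
    let mut out = values.to_vec();
    let mut prev: Option<usize> = None; // index of the last known value
    for i in 0..values.len() {
        if let Some(right) = values[i] {
            if let Some(p) = prev {
                let left = values[p].unwrap(); // safe: `prev` only points at a Some
                let span = (i - p) as f64;
                for j in p + 1..i {
                    let t = (j - p) as f64 / span;
                    out[j] = Some(left + t * (right - left));
                }
            }
            prev = Some(i);
        }
    }
    out
}

// Group rows into fixed-width buckets (in minutes) and average the known values.
// A bucket whose rows are all None gets a mean of None.
fn resample_mean(minutes: &[u32], values: &[Option<f64>], width: u32) -> Vec<(u32, Option<f64>)> {
    let mut buckets: BTreeMap<u32, (f64, u32)> = BTreeMap::new();
    for (&m, v) in minutes.iter().zip(values) {
        let entry = buckets.entry(m / width * width).or_insert((0.0, 0));
        if let Some(x) = v {
            entry.0 += x;
            entry.1 += 1;
        }
    }
    buckets
        .into_iter()
        .map(|(start, (sum, n))| (start, if n > 0 { Some(sum / n as f64) } else { None }))
        .collect()
}

fn main() {
    // val1 from the first five sample rows, timestamps as minutes after 00:00.
    let minutes = [45, 50, 53, 80, 113];
    let val1 = [None, None, Some(21.90), None, Some(21.90)];
    let filled = interpolate(&val1);
    for (start, mean) in resample_mean(&minutes, &filled, 60) {
        println!("{start:>4} min: {mean:?}");
    }
}
```

I'd obviously rather not hand-roll this for every column, which is why I'm hoping polars has built-in equivalents.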
Any help would be appreciated!