
I used arrow2 (specifically, io-odbc) to interact with a database and saved the data as Parquet; the data has the type Vec<Result<Chunk<Box<dyn Array>>>>. Example code below:

use arrow2::{
    array::Array,
    chunk::Chunk,
    datatypes::Schema,
    error::Result,
    io::parquet::write::{
        transverse, CompressionOptions, Encoding, FileWriter, RowGroupIterator, Version,
        WriteOptions,
    },
};

/// Write the chunks as row groups of a single Parquet file at `path`.
pub fn write_batch(
    path: &str,
    schema: Schema,
    columns: Vec<Result<Chunk<Box<dyn Array>>>>,
) -> Result<()> {
    let options = WriteOptions {
        write_statistics: true,
        compression: CompressionOptions::Uncompressed,
        version: Version::V2,
    };
    // One Vec<Encoding> per field, covering nested types via `transverse`.
    let encodings: Vec<Vec<Encoding>> = schema
        .fields
        .iter()
        .map(|f| transverse(&f.data_type, |_| Encoding::Plain))
        .collect();
    let row_groups = RowGroupIterator::try_new(columns.into_iter(), &schema, options, encodings)?;
    let file = std::fs::File::create(path)?;
    let mut writer = FileWriter::try_new(file, schema, options)?;

    for group in row_groups {
        writer.write(group?)?;
    }
    let _size = writer.end(None)?;
    Ok(())
}

If I have Vec<Result<Chunk<Box<dyn Array>>>>, how do I convert this type to a Polars DataFrame? Or, more specifically, Result<Chunk<Box<dyn Array>>> to a Polars ChunkedArray?

katrocitus
