I'm writing a program that uses the Apache Arrow C++ library to extract metadata from a Parquet file, and I've had a lot of trouble finding documentation and examples.
After some trial and error I managed to get it working with this code:

std::unique_ptr<parquet::RowGroupMetaData> rowGroup = file_metadata->RowGroup(0);

std::shared_ptr<parquet::Statistics> stats = (rowGroup->ColumnChunk(0))->statistics();
    
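// parquet::TypedStatistics is parameterized on Parquet physical types,
// so parquet::Int32Type is used here rather than arrow::Int32Type.
const parquet::TypedStatistics<parquet::Int32Type> *typed_stats =
    static_cast<const parquet::TypedStatistics<parquet::Int32Type>*>(stats.get());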
        
std::cout << "Min(" << i << "): " << typed_stats->min() << std::endl;
std::cout << "Max(" << i << "): " << typed_stats->max() << std::endl;

Is there a different or better way of doing this?
And is there a way to detect the type automatically, or will I have to write if/else or switch statements for each type?

I'm using Apache Arrow 11.0 and g++ 11.3.0

0x26res
Alberto Pires
  • Maybe you can use simply `auto typed_stats = stats.get()`? – kiner_shah Mar 08 '23 at 10:15
  • Didn't work, it can assign the value to typed_stats but I can't access the methods: (error: ‘class parquet::Statistics’ has no member named ‘min’) – Alberto Pires Mar 08 '23 at 19:38
  • Ohh I see, you need the cast then. Unfortunately I don't know if there can be a better solution, maybe there is some clean solution to this. – kiner_shah Mar 09 '23 at 05:57
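
For what it's worth, the usual shape of the "switch per type" approach the question asks about is one switch on the physical type that forwards to a single templated helper, so the cast and the printing logic are written only once. The sketch below is a self-contained model of that pattern and does not use Arrow: `PhysicalType`, `Statistics`, `TypedStatistics`, `FormatMinMax` and `DescribeMinMax` are hypothetical stand-ins defined here so that it compiles on its own. In real Parquet code the corresponding pieces would, as far as I know, be `stats->physical_type()`, the `parquet::Type::INT32` style enum values, and aliases such as `parquet::Int32Statistics`; please check them against the headers of your Arrow version.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical stand-in for the physical type enum (not the Arrow API).
enum class PhysicalType { INT32, INT64, DOUBLE };

// Hypothetical stand-in for the untyped statistics base class.
struct Statistics {
  virtual ~Statistics() = default;
  virtual PhysicalType physical_type() const = 0;
};

// Hypothetical stand-in for the typed statistics class: only this
// level exposes min() and max().
template <typename T, PhysicalType kType>
struct TypedStatistics : Statistics {
  TypedStatistics(T min, T max) : min_(min), max_(max) {
    assert(!(max_ < min_));  // sanity check on the sample data
  }
  PhysicalType physical_type() const override { return kType; }
  const T& min() const { return min_; }
  const T& max() const { return max_; }

 private:
  T min_;
  T max_;
};

using Int32Statistics = TypedStatistics<std::int32_t, PhysicalType::INT32>;
using Int64Statistics = TypedStatistics<std::int64_t, PhysicalType::INT64>;
using DoubleStatistics = TypedStatistics<double, PhysicalType::DOUBLE>;

// The cast and the formatting live in one template, so supporting
// another type only costs one extra case in the switch below.
template <typename TypedStats>
std::string FormatMinMax(const Statistics& stats) {
  const auto& typed = static_cast<const TypedStats&>(stats);
  std::ostringstream out;
  out << "min=" << typed.min() << " max=" << typed.max();
  return out.str();
}

// Single dispatch point: inspect the runtime type tag, then pick the
// matching typed class for the downcast.
std::string DescribeMinMax(const Statistics& stats) {
  switch (stats.physical_type()) {
    case PhysicalType::INT32:
      return FormatMinMax<Int32Statistics>(stats);
    case PhysicalType::INT64:
      return FormatMinMax<Int64Statistics>(stats);
    case PhysicalType::DOUBLE:
      return FormatMinMax<DoubleStatistics>(stats);
  }
  return "unsupported physical type";
}
```

With these stand-ins, `DescribeMinMax(Int32Statistics(3, 42))` yields the string `min=3 max=42`. The switch does not go away, but it shrinks to one line per type, and the static_cast is only reached after the type tag has been checked.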

0 Answers