What do your includes look like? That error message suggests you are not including the right files. The full definition of arrow::Int64Builder is in arrow/array/builder_primitive.h, but you can usually just include arrow/api.h to get everything.
The following compiles for me:
#include <iostream>
#include <utility>
#include <vector>

#include <arrow/api.h>

arrow::Status Main() {
  std::size_t rowCount = 5;
  arrow::MemoryPool* pool = arrow::default_memory_pool();
  std::vector<arrow::Int64Builder> builders;
  for (std::size_t i = 0; i < 2; i++) {
    arrow::Int64Builder tmp(pool);
    ARROW_RETURN_NOT_OK(tmp.Reserve(rowCount));
    // Builders are move-only, so move rather than copy into the vector
    builders.push_back(std::move(tmp));
  }
  return arrow::Status::OK();
}

int main() {
  auto status = Main();
  if (!status.ok()) {
    std::cerr << "Err: " << status << std::endl;
    return 1;
  }
  return 0;
}
One small change from your example: builders don't have a copy constructor, so they can't be copied. I had to std::move the builder into the vector.
Also, if you want a single collection with many different types of builders then you probably want std::vector<std::unique_ptr<arrow::ArrayBuilder>> and you'll need to construct your builders on the heap.
One challenge you may run into is that the builders all have different signatures for the Append method (e.g. Int64Builder has Append(int64_t) but StringBuilder has Append(arrow::util::string_view), or std::string_view in newer Arrow releases). As a result, arrow::ArrayBuilder doesn't really have any Append methods (there are a few that take scalars, if you happen to already have your data as an Arrow C++ scalar). However, you can probably overcome this by casting to the appropriate type when you need to append.
Update:
If you really want to avoid casting and you know the schema ahead of time you could maybe do something along the lines of...
std::vector<std::function<arrow::Status(const Row&)>> append_funcs;
std::vector<std::shared_ptr<arrow::ArrayBuilder>> builders;
for (std::size_t i = 0; i < schema.fields().size(); i++) {
  const auto& field = schema.fields()[i];
  if (isInt32(field)) {
    auto int_builder = std::make_shared<arrow::Int32Builder>();
    // Capture the column index by value along with the builder
    append_funcs.push_back([int_builder, i](const Row& row) {
      int val = row.GetCell<int>(i);
      return int_builder->Append(val);
    });
    builders.push_back(std::move(int_builder));
  } else {
    // Other types go here
  }
}
// Later
for (const auto& row : rows) {
  for (const auto& append_func : append_funcs) {
    ARROW_RETURN_NOT_OK(append_func(row));
  }
}
Note: I made up Row because I have no idea what format your data is in originally. I also made up isInt32 because I don't recall how to check that off the top of my head.
This uses shared_ptr instead of unique_ptr because you need two copies, one in the capture of the lambda and the other in the builders array.