Suppose I am processing each row of a Parquet file, and each row has a repeated string field named myList. How can I get the last value of myList in each row?

This example uses a vector to store all values. Is there any convenient way to get the last value of the repeated field in each row directly?

My code is like this:

auto chunk_array = table->GetColumnByName("myList");
auto list = std::static_pointer_cast<arrow::ListArray>(chunk_array->chunk(0));
for (int64_t cur_row = 0; cur_row < table->num_rows(); ++cur_row) {
    // TODO: get the last value of myList in the current row
}

thanks~

1 Answer


I eventually solved it with the code below:

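auto chunk_array = table->GetColumnByName("myList");
// Only the first chunk is handled here; loop over all chunks if there are several.
auto list = std::static_pointer_cast<arrow::ListArray>(chunk_array->chunk(0));
// The question describes myList as a repeated string field, so the child array
// is cast to StringArray; use the array type that matches your element type.
auto varr = std::static_pointer_cast<arrow::StringArray>(list->values());
for (int64_t cur_row = 0; cur_row < list->length(); ++cur_row) {
    // Row i owns the child values in [value_offset(i), value_offset(i + 1)).
    const int32_t start = list->value_offset(cur_row);
    const int32_t end = list->value_offset(cur_row + 1);
    if (list->IsNull(cur_row) || end == start) continue;  // null or empty list
    std::string last = varr->GetString(end - 1);  // last value of myList in this row
}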
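
The offset arithmetic can be checked without Arrow. Below is a minimal sketch that models a list column with two plain vectors: offsets (one more entry than there are rows) and the flattened values. ListColumn and LastValues are hypothetical names made up for this illustration, not part of the Arrow API, and null rows are not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Plain-vector stand-in for a list column (hypothetical type, not Arrow):
// row i owns values[offsets[i]] .. values[offsets[i + 1] - 1].
struct ListColumn {
    std::vector<int32_t> offsets;     // one more entry than there are rows
    std::vector<std::string> values;  // all list elements, flattened
};

// Returns the last element of each row, or std::nullopt for an empty row.
std::vector<std::optional<std::string>> LastValues(const ListColumn& col) {
    std::vector<std::optional<std::string>> result;
    if (col.offsets.empty()) return result;  // no offsets means no rows
    const std::size_t num_rows = col.offsets.size() - 1;
    result.reserve(num_rows);
    for (std::size_t row = 0; row < num_rows; ++row) {
        const int32_t start = col.offsets[row];
        const int32_t end = col.offsets[row + 1];
        // Offsets must be non-decreasing and stay inside the values vector.
        assert(start <= end &&
               static_cast<std::size_t>(end) <= col.values.size());
        if (end == start) {
            result.push_back(std::nullopt);  // empty list: no last element
        } else {
            result.push_back(col.values[end - 1]);  // element just before the next row starts
        }
    }
    return result;
}
```

With offsets {0, 2, 2, 5} and values {"a", "b", "c", "d", "e"}, the three rows are ["a", "b"], [] and ["c", "d", "e"], so the result is "b", no value, and "e". The empty row is the case to watch: its two offsets are equal, so reading index end - 1 there would return the previous row's last element.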