My application uses protocol buffers and handles a large number (100 million) of simple messages. Based on callgrind analysis, one memory allocation and one deallocation are performed for each message instance.
Consider the following representative example:
// .proto
syntax = "proto2";
package testpb;

message Top {
    message Nested {
        optional int32 val1 = 1;
        optional int32 val2 = 2;
        optional int32 val3 = 3;
    }
    repeated Nested data = 1;
}
// .cpp
#include <fstream>
#include "test.pb.h"  // generated header; actual name depends on the .proto filename

void test()
{
    testpb::Top top;
    for (int i = 0; i < 100'000; ++i) {
        auto* data = top.add_data();
        data->set_val1(i);
        data->set_val2(i * 2);
        data->set_val3(i * 3);
    }
    std::ofstream ofs{"file.out", std::ios::out | std::ios::trunc | std::ios::binary};
    top.SerializeToOstream(&ofs);
}
What is the most effective way to change the implementation so that the number of memory allocations does not grow linearly with the number of Nested instances?
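For reference, the option I have come across so far is protobuf arena allocation. Below is a rough, untested sketch of how I understand it would apply here; I'm assuming the generated header is named test.pb.h, and older protoc releases may additionally require `option cc_enable_arenas = true;` in the .proto.

// .cpp (arena variant, untested sketch)
#include <fstream>
#include <google/protobuf/arena.h>
#include "test.pb.h"  // generated header; actual name depends on the .proto filename

void test_arena()
{
    google::protobuf::Arena arena;
    // Top and every Nested child are carved out of the arena's large
    // blocks instead of being individually heap-allocated.
    // (Arena::Create<testpb::Top>(&arena) in newer protobuf releases.)
    auto* top = google::protobuf::Arena::CreateMessage<testpb::Top>(&arena);
    for (int i = 0; i < 100'000; ++i) {
        auto* data = top->add_data();
        data->set_val1(i);
        data->set_val2(i * 2);
        data->set_val3(i * 3);
    }
    std::ofstream ofs{"file.out", std::ios::out | std::ios::trunc | std::ios::binary};
    top->SerializeToOstream(&ofs);
    // Everything owned by the arena is released in bulk when it goes out of scope.
}

My understanding is that the arena serves many messages from a few large blocks, so the allocation count should scale with total bytes rather than with the number of Nested instances, but I'd welcome confirmation or a more effective alternative.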