
After migrating existing code from Protobuf (specifically, protobuf-lite) to FlatBuffers, I'm now at the point of assessing the performance of both (before hopefully retiring Protobuf)... but the results are not what I expected.

The IDL design/schema of the message type(s) is practically the same (in fact, the first version of my FlatBuffers schema was auto-derived from the Protobuf one using the flatc compiler option --proto). The schema has a root type of a simple table containing 3 strings and a vector of simple key-value tables (which becomes a std::vector in the generated C++). Each key-value table holds a string key followed by an int, float, double or string value.

enum __Type : int { fb_UNKNOWN = 0, fb_INT = 1, fb_FLOAT = 2, fb_DOUBLE = 3, fb_STRING = 4 }

table __KeyValuePair (native_custom_alloc: "fb_custom_allocator")
{
   key: string;

   int_Type: __Type = fb_UNKNOWN;

   int_Value: int;
   float_Value: float;
   double_Value: double;
   string_Value: string;
}


table __TradingFloorEvent (native_custom_alloc: "fb_custom_allocator")
{
   str_Trader: string;
   str_Exchange: string;
   str_Currency: string;

   vec_KeyValuePairs: [__KeyValuePair];
}

root_type __TradingFloorEvent;

So, to the FlatBuffers-aware folks, this is not a complicated schema. You'll also see that I've chosen a custom allocator - behind the scenes it uses the header-only boost::pool library to reuse previous memory allocations. It works perfectly and has already shown impressive performance improvements when building up the object (before serialization).

Another possibly important piece of info: I'm generating the C++ code with the --gen-object-api option (which is necessary to allow the use of the custom allocator anyway), meaning that serialization and deserialization are now achieved using FlatBuffers' Pack() and UnPack() functions.

The problem: serialization of a TradingFloorEvent that contains many key-value pairs (say, 100) is disturbingly slow compared to Protobuf - and the more key-value pairs in the vector, the worse it gets, sometimes 10 times slower.

FYI: I'm on MS-Windows using Visual Studio 2022 with performance assessed using "Release" builds.

My FlatBuffers "builder" is initialized with 1 MB (the largest serialized buffer, so far, is just 10 kB), so pre-allocation of that ultimate buffer should be a given. So what could be going wrong? Could there be expensive behind-the-scenes allocations that are not evident to me, could there be other FlatBuffers schema options that might help, or could my developer head be completely fried and in need of a hard reboot?

  • Using `--gen-object-api` will kill any performance benefit of FlatBuffers. That API is for ease of use and is not performance-minded. If you are concerned with performance, you'll be better off using the base FlatBuffers API. – Moop Apr 19 '22 at 16:17
  • What @Moop said. In addition, this schema is paying a big cost to subvert the strongly typed nature of FlatBuffers by storing generic key/value pairs. That's not what it is intended for, and it won't be fast this way. Try turning your keys into field names, and your values into precisely typed fields, preferably not a lazy "string" but scalar types and enums where possible. If your data is mostly key-value pairs that can't be statically typed, you're better off with a data format made for key-value pairs, like FlexBuffers. – Aardappel Apr 19 '22 at 16:37

0 Answers