Google's Protocol Buffers implementation contains a TextFormat class, which can serialize Messages to text and parse them back.

How stable is this text format? Specifically:

  • If I serialize a proto2-defined message to UTF-8 text, will any other version of Google's Protocol Buffers implementation in the same language be able to deserialize it, given the same proto2 message definition?
  • Is this still true if we're talking about google-published implementations in different languages?
Chuu
  • I think your question is covered by the answer [here](http://stackoverflow.com/questions/6604929/data-format-compatibility-between-protobuf-versions?rq=1). – psv Dec 09 '15 at 13:11
  • @petersv That's about the binary format used over the wire. I'm asking about the text format. – Chuu Dec 09 '15 at 15:49

1 Answer

Yes, protobuf "text format" is the same across all implementations. You can call toString() in Java and then parse it using TextFormat in C++, etc.

Note, however, that text format is intended for communications where one end (sender or receiver) is a human. For computer-to-computer communications, you should always use binary format. Text format has some important differences that make sense when talking to a human but not between computers:

  • When an unknown field name is seen in text input, it is an error. In contrast, with binary format, unknown fields are ignored for forwards-compatibility. In text format, though, the assumption is that an unknown name is probably a typo on the part of the human, and so would be dangerous to ignore.
  • Text format parsing and writing is much, much slower than binary format. It is implemented in terms of reflection interfaces rather than generated code, and it is not well-optimized.
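
To make the unknown-field difference concrete, here is a minimal sketch using the Python `protobuf` package (assumed to be installed). The choice of `Timestamp` and `Int64Value` is mine, purely for illustration: `Int64Value` declares only field 1, so it stands in for an "older" reader that has never heard of field 2.

```python
from google.protobuf import text_format
from google.protobuf import timestamp_pb2, wrappers_pb2

# Text format round trip: write a message as text, then parse it back.
original = timestamp_pb2.Timestamp(seconds=7, nanos=5)
as_text = text_format.MessageToString(original)
round_tripped = text_format.Parse(as_text, timestamp_pb2.Timestamp())

# Binary format: Int64Value only declares field 1, so field 2 (nanos) in the
# Timestamp bytes is unknown to it and does not cause a parse failure.
from_binary = wrappers_pb2.Int64Value.FromString(original.SerializeToString())

# Text format: a field name the target type does not declare is an error.
try:
    text_format.Parse("value: 7\nnanos: 5", wrappers_pb2.Int64Value())
    text_rejected = False
except text_format.ParseError:
    text_rejected = True

print(round_tripped == original, from_binary.value, text_rejected)
```

With default parser settings, the binary parse should succeed and keep the value from field 1, while the text parse should raise `ParseError` on the undeclared name.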
Kenton Varda