
Common practice for using protobufs over the wire, including by gRPC, is to length-prefix protobuf messages into frames (e.g. like this) so that the decoder knows where one message stops and the next starts.
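For reference, here is a minimal sketch of the kind of length-prefixed framing gRPC uses: a 1-byte compression flag followed by a 4-byte big-endian length and then the message bytes. The function names `frame` and `unframe` are made up for illustration and are not part of any library.

```python
import struct

def frame(payload: bytes) -> bytes:
    """Prefix a serialized message with a gRPC-style 5-byte header."""
    # 1 byte: compressed flag (0 = uncompressed), 4 bytes: big-endian length.
    return struct.pack(">BI", 0, len(payload)) + payload

def unframe(stream: bytes) -> list[bytes]:
    """Split a buffer of concatenated frames back into message payloads."""
    messages = []
    pos = 0
    while pos < len(stream):
        _flag, length = struct.unpack_from(">BI", stream, pos)
        pos += 5  # skip the header
        messages.append(stream[pos:pos + length])
        pos += length
    return messages
```

The decoder never has to look inside the payload; the cost is that the encoder must know the full length before writing the first byte.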

This seems unnecessary. According to the spec, a protobuf message consists of a sequence of tags, each followed by a value:

message := (tag value)*
tag := (field << 3) bit-or wire_type
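As a small illustration of the tag formula, the sketch below packs and unpacks a tag value. `make_tag` and `split_tag` are hypothetical helper names, and the result is the integer tag before varint encoding.

```python
def make_tag(field_number: int, wire_type: int) -> int:
    """Combine a field number and a 3-bit wire type into a tag value."""
    return (field_number << 3) | wire_type

def split_tag(tag: int) -> tuple[int, int]:
    """Recover (field_number, wire_type) from a tag value."""
    return tag >> 3, tag & 0x7

# Field 1 with wire type 0 (VARINT) is the smallest legal tag.
smallest = make_tag(1, 0)
```

Because field numbers start at 1, `smallest` is 8, so no legal tag value is 0.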

Once a tag is read, its wire type tells the parser how to find the end of the value (a fixed width, a varint continuation bit, or an explicit length prefix), so the parser can parse the value in its entirety without needing more metadata. Thus, the length is only needed to figure out whether, at any given point, there are more tags left to be parsed in the message.
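To make that concrete, here is a rough sketch of how a parser can step over one tag-value pair using only the wire type. The helper names `read_varint` and `skip_field` are assumptions for this example; group wire types (3 and 4) are deliberately left out.

```python
def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode a base-128 varint at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # continuation bit clear: last byte
            return result, pos
        shift += 7

def skip_field(buf: bytes, pos: int) -> tuple[int, int, int]:
    """Step over one tag-value pair; return (field, wire_type, next position)."""
    tag, pos = read_varint(buf, pos)
    field_number, wire_type = tag >> 3, tag & 0x7
    if wire_type == 0:    # VARINT: ends at the first byte without the MSB set
        _, pos = read_varint(buf, pos)
    elif wire_type == 1:  # 64-bit fixed width
        pos += 8
    elif wire_type == 2:  # length-delimited: varint length, then that many bytes
        length, pos = read_varint(buf, pos)
        pos += length
    elif wire_type == 5:  # 32-bit fixed width
        pos += 4
    else:
        raise ValueError(f"unsupported wire type {wire_type}")
    return field_number, wire_type, pos
```

For the bytes `08 96 01 12 02 68 69`, the first call stops after the varint 150 in field 1 and the second stops after the two-byte string in field 2.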

An obvious non-length solution presents itself: null (0x0) termination. Field indices start at 1, so a tag can never be 0: the smallest possible tag, field = 1 with the VARINT wire type, is 0b1000 = 8, and the varint encoding of any nonzero value begins with a nonzero byte (either its low seven bits are nonzero or the continuation MSB is set). Thus, if:

  • the parser is in between tag-value pairs and
  • encounters a null byte

it follows that this byte is not part of the rest of the protobuf message and thus a corresponding action (such as terminating the message) may be taken.
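The scheme described above could be sketched roughly as follows. This is only an illustration of the idea in the question, not how any particular protobuf library behaves; `split_null_terminated` and its helpers are invented names, and group wire types are not handled.

```python
def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode a base-128 varint at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

def end_of_value(buf: bytes, pos: int, wire_type: int) -> int:
    """Return the position just past a value of the given wire type."""
    if wire_type == 0:
        return read_varint(buf, pos)[1]
    if wire_type == 1:
        return pos + 8
    if wire_type == 2:
        length, pos = read_varint(buf, pos)
        return pos + length
    if wire_type == 5:
        return pos + 4
    raise ValueError(f"unsupported wire type {wire_type}")

def split_null_terminated(stream: bytes) -> list[bytes]:
    """Split a stream of messages that are each followed by a 0x00 byte."""
    messages = []
    start = pos = 0
    while pos < len(stream):
        if stream[pos] == 0:
            # A zero byte in tag position cannot belong to the message.
            messages.append(stream[start:pos])
            pos += 1
            start = pos
            continue
        tag, pos = read_varint(stream, pos)
        pos = end_of_value(stream, pos, tag & 0x7)
    if start != pos:
        raise ValueError("stream ended without a terminator")
    return messages
```

Zero bytes inside values are stepped over because the splitter only tests for 0x00 when it is positioned between tag-value pairs.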

All of this seems very obvious given the protobuf specification, so am I simply missing some detail that breaks this?

Another way of phrasing the question is if there's a case where you will get a 0x0 tag within a valid message? It appears that nanopb would use NULL termination for a while and only stopped due to issues with debugging broken encoders.

ckfinite

3 Answers


The ultimate problem here is that it isn't defined as such in the specification, and any such change would break all existing implementations. The wire format doesn't allow for it, and encoding sub-messages in a null-suffixed way would require a new "wire type" so that decoders know to look for nulls in a forwards field-by-field way.

However! Something very similar already does exist; "groups". The wire specification defines groups as a prefix and suffix sentinel pair, akin to { and } in JSON. Not much more expensive than a null byte, although it needs two of them and they include the field number, but: virtually all decoders already know how to handle group encoding, even though they have been almost erased from sight.
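As an illustration of the wire-level difference, the sketch below encodes the same sub-message body once with a length prefix (wire type 2) and once between start-group and end-group tags (wire types 3 and 4). The helper names are made up for this example, and the body is assumed to be already-serialized fields.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def encode_length_delimited(field_number: int, body: bytes) -> bytes:
    """Wire type 2: the length must be known before the body is written."""
    tag = encode_varint((field_number << 3) | 2)
    return tag + encode_varint(len(body)) + body

def encode_group(field_number: int, body: bytes) -> bytes:
    """Wire types 3 and 4: sentinel tags around the body, no length needed."""
    start = encode_varint((field_number << 3) | 3)
    end = encode_varint((field_number << 3) | 4)
    return start + body + end
```

With a body of `08 01` in field 1, the length-delimited form is `0a 02 08 01` and the group form is `0b 08 01 0c`. The group encoder can stream the body without buffering it to measure its size.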

To be clear: I'm not talking about the group concept from proto2 - I'm only talking about the wire specification. I would, for example, propose using a modifier on regular field definitions instead, applicable only to message types.

I have tried to petition for their resurrection as an alternative mechanism for writing messages without needing to compute lengths, but: I have been unsuccessful so far. See: https://github.com/protocolbuffers/protobuf/issues/9134

Marc Gravell
  • For some additional context here, I'm interested in top-level message framing, not sub-object framing (I was actually looking at your answer [here](https://stackoverflow.com/a/52338751/5419369) when doing background research for this question). As a result, I don't think that this should require a wire format change since top-level message framing isn't a part of the format specification in the first place? The proposal for sub-objects makes a lot of sense - computing lengths is a pain. Unfortunate about the lack of interest. – ckfinite Mar 02 '23 at 22:44
  • @ckfinite ah, sorry, I feel I have wasted many words then! Sub-message framing: isn't defined by the protocol. At the root level, some decoders *will* stop at a null byte, but it isn't universal precisely because it isn't defined. You mention gRPC, but gRPC has single-message payloads with framing defined at the RPC level, so no sub-message framing is required additional to that – Marc Gravell Mar 02 '23 at 22:53
  • That makes sense in that case - it just seems strange to me that at the root level it isn't more common/in the standard to go "there's a 0x0 at the end of the message." If null bytes are undesirable then another invalid tag like 0b00000111 could be used as well. This seems like a straightforward way to delimit top-level messages; in combination with the sub-object framing suggestion you made, it seems like most length computations could be eliminated completely with this scheme? – ckfinite Mar 02 '23 at 22:57

Just to provide some follow up, a broader examination of the concept suggests that it:

  • Could technically work - nothing in the spec says you can't do this.
  • Doesn't work practically because (as Marc Gravell points out) client implementations will look at random NULLs that they run into after a value in all sorts of wacky and wonderful ways and you don't have much control over it.

In a case where the client parser is known and supports it (such as if you're using nanopb) then it could be a viable alternative. If you're trying to support a range of clients, however, it's not a good idea as it's hard to impose on libraries that are not looking for it.

ckfinite

Another way of phrasing the question is if there's a case where you will get a 0x0 tag within a valid message?

No, field number 0 is reserved by the specification and protoc will refuse to compile a message that uses it. Therefore a zero byte never appears in tag position in a valid message.

Zero can appear elsewhere in the message, so reading data up to the termination requires code that is aware of the protobuf format.
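A quick way to see this: the hand-built message below holds a varint field set to 0 and a fixed32 field set to 1, so its payload contains zero bytes even though neither tag is zero. The field numbers and the `naive_split` helper are arbitrary choices for this example.

```python
import struct

# Field 1, wire type 0 (VARINT), value 0  -> bytes 08 00
# Field 2, wire type 5 (fixed32), value 1 -> bytes 15 01 00 00 00
message = bytes([0x08, 0x00]) + bytes([0x15]) + struct.pack("<I", 1)

def naive_split(stream: bytes) -> list[bytes]:
    """Format-unaware splitting: cut at every zero byte."""
    return stream.split(b"\x00")

# One message followed by a single terminator byte.
pieces = naive_split(message + b"\x00")
```

`pieces` ends up as several fragments rather than the one original message, which is why the reader has to track tag-value boundaries instead of scanning for the first zero.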

It appears that nanopb would use NULL termination for a while and only stopped due to issues with debugging broken encoders.

Nanopb still supports 0-termination, but it is now enabled by a separate flag, PB_DECODE_NULLTERMINATED, with a corresponding PB_ENCODE_NULLTERMINATED for the encoder. The reason for separating this out was to make detection of corrupted messages more robust when the message length is known.

jpa