I'm currently designing a binary format for logging a series of protobuf-encoded messages. I need to be able to read the messages newest to oldest (from the end of the stream) as well as oldest to newest, so I believe I can't delimit messages with a length prefix, as the docs suggest, unless I'm willing to scan through the entire log from beginning to end.

Is it safe to use ASCII control character separators (i.e. 28 file separator, 29 group separator, 30 record separator, 31 unit separator) to delimit protobuf messages?

Alternatively, is it unheard of to use something like a length suffix in addition to a length prefix to create a message sandwich (length message length) to allow reading through the messages both forwards and backwards?

This question is similar to this one, but that question does not mention the use case of reading messages backwards, nor does it deal with protobuf messages specifically.

brainkim

2 Answers

Is it safe to use ASCII control character separators

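No, basically. A protobuf message can contain any conceivable byte sequence, including the bytes 28 to 31, so no control character is safe to use as a delimiter on its own.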
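As a small illustration of why, here is a sketch that hand-encodes a single varint field following the protobuf wire format (tag byte, then value). The helper names `encode_varint` and `encode_varint_field` are made up for this example and are not part of any protobuf library; only non-negative integers are handled.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)         # last byte, continuation bit clear
            return bytes(out)


def encode_varint_field(field_number: int, value: int) -> bytes:
    """Encode one varint field: tag = (field_number << 3) | wire type 0."""
    tag = (field_number << 3) | 0
    return encode_varint(tag) + encode_varint(value)


# A message whose field 1 holds the integer 30 contains the byte 0x1E,
# which is the ASCII "record separator".
message = encode_varint_field(1, 30)
print(message.hex())  # 081e
```

The same happens for 28, 29 and 31, and `bytes` or `string` fields can carry those values directly, so a reader splitting on them would cut messages in half.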

A length suffix would work; unusual, but workable.

Marc Gravell

Length suffix works fine for reading from either end.
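A minimal sketch of the "length message length" layout, assuming a fixed 4-byte little-endian length on both sides. The fixed width is my assumption: a varint length would be hard to locate when scanning backwards. The function names `frame`, `read_forward` and `read_backward` are hypothetical, and the payloads are treated as opaque bytes (already-serialized protobuf messages).

```python
import struct

LEN = struct.Struct("<I")  # assumed framing: fixed 4-byte little-endian length


def frame(payload: bytes) -> bytes:
    """Wrap a payload as: length, payload, length."""
    n = LEN.pack(len(payload))
    return n + payload + n


def read_forward(log: bytes):
    """Yield payloads oldest to newest, using the length prefix."""
    pos = 0
    while pos < len(log):
        (n,) = LEN.unpack_from(log, pos)
        start = pos + LEN.size
        end = start + n
        (suffix,) = LEN.unpack_from(log, end)
        if suffix != n:
            raise ValueError("prefix and suffix lengths disagree")
        yield log[start:end]
        pos = end + LEN.size


def read_backward(log: bytes):
    """Yield payloads newest to oldest, using the length suffix."""
    pos = len(log)
    while pos > 0:
        end = pos - LEN.size
        if end < LEN.size:
            raise ValueError("truncated record")
        (n,) = LEN.unpack_from(log, end)
        start = end - n
        if start < LEN.size:
            raise ValueError("suffix length points before start of log")
        (prefix,) = LEN.unpack_from(log, start - LEN.size)
        if prefix != n:
            raise ValueError("prefix and suffix lengths disagree")
        yield log[start:end]
        pos = start - LEN.size


messages = [b"\x08\x1e", b"", b"\x1c\x1d\x1e\x1f"]
log = b"".join(frame(m) for m in messages)
```

With this layout, `list(read_forward(log))` gives the messages in write order and `list(read_backward(log))` gives them reversed. Comparing prefix against suffix also acts as a cheap corruption check.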

If you also needed the ability to seek to the middle of the stream and find the start of the next message, you could use Consistent Overhead Byte Stuffing to free up one character, such as 0x00, to be used as a delimiter.
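A rough sketch of that approach, assuming each message is COBS-encoded and followed by a single 0x00 delimiter. The names `cobs_encode`, `cobs_decode` and `frames_from` are hypothetical helpers written for this example, not a library API.

```python
def cobs_encode(data: bytes) -> bytes:
    """COBS-encode data so that the result contains no 0x00 bytes."""
    out = bytearray()
    block = bytearray()
    for b in data:
        if b == 0:
            out.append(len(block) + 1)  # code byte: distance to this zero
            out += block
            block.clear()
        else:
            block.append(b)
            if len(block) == 254:       # full block, no implied zero
                out.append(255)
                out += block
                block.clear()
    out.append(len(block) + 1)          # final block
    out += block
    return bytes(out)


def cobs_decode(enc: bytes) -> bytes:
    """Reverse cobs_encode; raises ValueError on malformed input."""
    out = bytearray()
    i = 0
    while i < len(enc):
        code = enc[i]
        if code == 0:
            raise ValueError("zero byte inside COBS frame")
        block = enc[i + 1 : i + code]
        if len(block) != code - 1:
            raise ValueError("truncated COBS frame")
        out += block
        i += code
        if code < 255 and i < len(enc):
            out.append(0)               # restore the zero the code stood for
    return bytes(out)


def frames_from(stream: bytes, offset: int) -> list:
    """Resync at the next 0x00 at or after offset, then decode every
    complete frame that follows it."""
    start = stream.find(b"\x00", offset)
    if start == -1:
        return []
    pieces = stream[start + 1 :].split(b"\x00")[:-1]  # drop unterminated tail
    return [cobs_decode(p) for p in pieces]


messages = [b"\x08\x1e", b"\x00\x00abc", bytes(range(256)) * 2]
stream = b"".join(cobs_encode(m) + b"\x00" for m in messages)
```

Here `frames_from(stream, 1)` starts inside the first frame, skips ahead to its delimiter, and returns the second and third messages. The cost is at most one extra byte per 254 bytes of payload, plus the delimiter.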

jpa