I'm trying to use Google Protocol Buffers to read multiple messages from a file. The documentation suggests using CodedInputStream.

But if I try to read anything larger than a very small message, MergeFromCodedStream fails.

For example, if I have a message defined as:

message Chunk {
  repeated int64 values = 1 [packed=true];
}

And try to write the message to file and then read it back:

int main() {
  GOOGLE_PROTOBUF_VERIFY_VERSION;
  {
      Chunk chunk;
      for (int i = 0; i != 26; ++i)
        chunk.add_values(i);

      std::ofstream output("D:\\temp.bin");
      OstreamOutputStream raw_output(&output);

      if (!writeDelimitedTo(chunk, &raw_output)){
        std::cout << "Unable to write chunk\n";
        return 1;
      }
  }
  {
    std::ifstream input("D:\\temp.bin");
    IstreamInputStream raw_input(&input);
    Chunk in_chunk;

    if (!readDelimitedFrom(&raw_input, &in_chunk)) { // <--- Fails here
      std::cout << "Unable to read chunk\n";
      return 1;
    }

    std::cout << "Num values in chunk " << in_chunk.values_size() << "\n";
  }

  google::protobuf::ShutdownProtobufLibrary();
}

where writeDelimitedTo and readDelimitedFrom come from this answer by the author of the C++ protobuf libraries:

bool writeDelimitedTo(
  const google::protobuf::MessageLite& message,
  google::protobuf::io::ZeroCopyOutputStream* rawOutput) {
  google::protobuf::io::CodedOutputStream output(rawOutput);

  const int size = message.ByteSize();
  output.WriteVarint32(size);

  uint8_t* buffer = output.GetDirectBufferForNBytesAndAdvance(size);
  if (buffer != NULL) {
    message.SerializeWithCachedSizesToArray(buffer);
  } else {
    message.SerializeWithCachedSizes(&output);
    if (output.HadError()) return false;
  }

  return true;
}

bool readDelimitedFrom(
  google::protobuf::io::ZeroCopyInputStream* rawInput,
  google::protobuf::MessageLite* message) {
  google::protobuf::io::CodedInputStream input(rawInput);

  uint32_t size;
  if (!input.ReadVarint32(&size)) return false;

  google::protobuf::io::CodedInputStream::Limit limit =
    input.PushLimit(size);

  if (!message->MergeFromCodedStream(&input)) return false; // <-- Fails here
  if (!input.ConsumedEntireMessage()) return false;

  input.PopLimit(limit);

  return true;
}

If I only write 25 values to my message it works; with 26 it fails. I've marked where it fails in the code.

I've tried debugging into the protobuf library, and it seems to be failing to read new data into the buffer, but I don't know why.

I'm using Visual Studio 2013 and protobuf 2.6.1.

Chris Drew
  • Are you saying that you change the loop at the top of the `main()` so that the limit is changed from `i != 26;` to something else such as `i != 30;` and you can not read the 30 values you added? Are there any error codes produced by the `MergeFromCodedStream()` function? What happens if you write a smaller number of values to your message say 22? By the way I suggest that you modify your two functions `writeDelimitedTo()` and `readDelimitedFrom()` to return a specific error code which will indicate at what point the function fails rather than just returning a `bool`. – Richard Chambers Jun 13 '15 at 12:19
  • @RichardChambers Basically, yes. If I change the loop to `i != 25` it works as expected. As written, with `i != 26` it fails. `MergeFromCodedStream` returns no error codes as far as I know. Just a bool to indicate success or failure. – Chris Drew Jun 13 '15 at 12:30
  • You probably need to add `std::ios::binary` line-ending flags on your `std::ofstream` and `std::ifstream` constructors. – rhashimoto Jun 13 '15 at 14:32
  • @rhashimoto Ah, of course! Thanks! – Chris Drew Jun 13 '15 at 14:38
  • This example using C++ shows using `std::ios::trunc` as well as `std::ios::binary` when creating and writing the output file. https://developers.google.com/protocol-buffers/docs/cpptutorial – Richard Chambers Jun 13 '15 at 16:44
  • @RichardChambers I think if a file is opened with `ofstream` `std::ios::trunc` [is the default](http://stackoverflow.com/a/10875173/3422652). – Chris Drew Jun 13 '15 at 19:16

1 Answer

As @rhashimoto correctly pointed out, I was failing to open my files in binary mode!
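
A plausible explanation for the cutoff at exactly 26 values (an inference from the protobuf wire format, not something stated in the thread): a `[packed=true]` field is written as a tag byte, a varint byte count, and then the values. For 26 single-byte values that count is 26, which is byte 0x1A (Ctrl-Z), and the Microsoft C runtime treats Ctrl-Z as end-of-file when a stream is read in text mode. With 25 values the count is 0x19 and no 0x1A byte appears. The sketch below hand-encodes the same bytes without the protobuf library; `appendVarint`, `encodeChunk` and `containsByte` are hypothetical helper names, not protobuf APIs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Append `value` as a protobuf base-128 varint (low 7 bits first).
void appendVarint(std::vector<std::uint8_t>& out, std::uint64_t value) {
  while (value >= 0x80) {
    out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out.push_back(static_cast<std::uint8_t>(value));
}

// Hand-encode the bytes a length-delimited Chunk holding 0..count-1 should
// produce: size prefix, tag of field 1, packed byte count, then the values.
std::vector<std::uint8_t> encodeChunk(int count) {
  assert(count > 0);  // an empty packed field is not written at all
  std::vector<std::uint8_t> packed;
  for (int i = 0; i < count; ++i)
    appendVarint(packed, static_cast<std::uint64_t>(i));

  std::vector<std::uint8_t> body;
  body.push_back(static_cast<std::uint8_t>((1 << 3) | 2));  // field 1, wire type 2
  appendVarint(body, packed.size());                        // packed byte count
  body.insert(body.end(), packed.begin(), packed.end());

  std::vector<std::uint8_t> framed;
  appendVarint(framed, body.size());  // varint length prefix of the message
  framed.insert(framed.end(), body.begin(), body.end());
  return framed;
}

// True if `b` occurs anywhere in `bytes`.
bool containsByte(const std::vector<std::uint8_t>& bytes, std::uint8_t b) {
  return std::find(bytes.begin(), bytes.end(), b) != bytes.end();
}
```

Under this encoding, `containsByte(encodeChunk(25), 0x1A)` is false and `containsByte(encodeChunk(26), 0x1A)` is true, which lines up with the observed 25/26 boundary.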

With that fixed I can successfully write multiple messages to file:

int main() {
  GOOGLE_PROTOBUF_VERIFY_VERSION;
  {
    std::vector<Chunk> chunks = createChunks(NUM_CHUNKS, CHUNK_SIZE);

    std::ofstream output("D:\\temp.bin", std::ios::binary);
    OstreamOutputStream raw_output(&output);

    for (Chunk& chunk : chunks) {
      if (!writeDelimitedTo(chunk, &raw_output)){
        std::cout << "Unable to write chunk\n";
        return 1;
      }
    }
  }
  {
    std::ifstream input("D:\\temp.bin", std::ios::binary);
    IstreamInputStream raw_input(&input);
    std::vector<Chunk> chunks(NUM_CHUNKS);

    for (auto& chunk : chunks) {
      if (!readDelimitedFrom(&raw_input, &chunk)) {
        std::cout << "Unable to read chunk\n";
        return 1;
      }
    }

    std::cout << "Num values in first chunk " << chunks[0].values_size() << "\n";
  }

  google::protobuf::ShutdownProtobufLibrary();
}
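
The loop above relies on knowing `NUM_CHUNKS` in advance. When the count is unknown, the reader has to tell a clean end of file apart from a truncated record. The following is a protobuf-free stand-in for the same varint length-prefixed framing, operating on plain `std::string` payloads; `writeRecord`, `readRecord`, `readAllRecords`, `ReadResult` and `kMaxRecordSize` are hypothetical names chosen for illustration, and the 64 MiB cap is an arbitrary assumption.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

const std::uint64_t kMaxRecordSize = 64u * 1024u * 1024u;  // assumed sanity cap

// Write one record: varint length prefix followed by the payload bytes.
void writeRecord(std::ostream& out, const std::string& payload) {
  std::uint64_t size = payload.size();
  while (size >= 0x80) {
    out.put(static_cast<char>((size & 0x7F) | 0x80));
    size >>= 7;
  }
  out.put(static_cast<char>(size));
  out.write(payload.data(), static_cast<std::streamsize>(payload.size()));
}

enum class ReadResult { Ok, EndOfStream, Corrupt };

// Read one record. End of stream before the first prefix byte is a clean
// end; end of stream anywhere later means the record was truncated.
ReadResult readRecord(std::istream& in, std::string& payload) {
  std::uint64_t size = 0;
  for (int shift = 0;; shift += 7) {
    if (shift >= 64) return ReadResult::Corrupt;  // varint too long
    const int c = in.get();
    if (c == std::char_traits<char>::eof())
      return shift == 0 ? ReadResult::EndOfStream : ReadResult::Corrupt;
    size |= static_cast<std::uint64_t>(c & 0x7F) << shift;
    if ((c & 0x80) == 0) break;
  }
  if (size > kMaxRecordSize) return ReadResult::Corrupt;
  payload.resize(static_cast<std::size_t>(size));
  in.read(&payload[0], static_cast<std::streamsize>(size));
  return static_cast<std::uint64_t>(in.gcount()) == size ? ReadResult::Ok
                                                         : ReadResult::Corrupt;
}

// Read records until a clean end of stream; false on a damaged record.
bool readAllRecords(std::istream& in, std::vector<std::string>& records) {
  for (;;) {
    std::string payload;
    switch (readRecord(in, payload)) {
      case ReadResult::Ok: records.push_back(std::move(payload)); break;
      case ReadResult::EndOfStream: return true;
      case ReadResult::Corrupt: return false;
    }
  }
}
```

With real files, the streams passed to these helpers would still need to be opened with `std::ios::binary`, for the same reason as in the code above.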
Chris Drew