1

i'm working with some tensorflow code and trying to load a trained checkpoint, but it's failing with a protobuf error like this:

[libprotobuf ERROR google/protobuf/wire_format_lite.cc:577] String field 'tensorflow.TensorShapeProto.Dim.name' contains invalid UTF-8 data when parsing a protocol buffer. Use the 'bytes' type if you intend to send raw bytes. 
Traceback (most recent call last):
  [...]
  File "/home/sopi/miniconda3/envs/magenta2/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 3053, in _as_graph_def
    graph.ParseFromString(compat.as_bytes(data))
google.protobuf.message.DecodeError: Error parsing message

in order to debug the training code that apparently is producing invalid utf-8, i'd like to know what the invalid data in question actually looks like. stepping through the code in pdb doesn't get me very far since ParseFromString() is implemented in C++.

how can i find out what the invalid utf-8 data is? or even the position in the byte array at which the error occurred?

(in this case, graph is a tensorflow.core.framework.graph_pb2.GraphDef, which is a subclass of google.protobuf.message.Message. but my question concerns protobuf parsing in general and i don't think there's anything special about GraphDef in this respect)

ahihi
  • 73
  • 1
  • 5

0 Answers0