
protoc-gen-go generates something like this at the end of the generated Go files:


var fileDescriptor_13c75530f718feb4 = []byte{
    // 2516 bytes of a gzipped FileDescriptorProto
    0x1f, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0xff, 0xbc, 0x59, 0xdf, 0x6f, 0x1c, 0x47,
...
}

I want to read it in plaintext for debugging purposes. How can I do that?

Why I want this: a small change that should not alter this generated file does alter it, and I am trying to figure out why (which is hard to debug, as it's just a binary blob).

Karel Bílek

2 Answers


I wrote code like this to parse and print the blob.

The key logic is actually from https://github.com/grpc/grpc-go/blob/759de4dd00c25745b6f3d7a9fdfb32beaf1d838e/reflection/serverreflection.go#L202-L226

package main

import (
    "bytes"
    "compress/gzip"
    "encoding/json"
    "fmt"

    "io/ioutil"

    proto "github.com/golang/protobuf/proto"
    dpb "github.com/golang/protobuf/protoc-gen-go/descriptor"
    _ [here write path to your generated go source]
    // Include the blank import above if you want to use proto.FileDescriptor;
    // omit it if you just copy-paste the bytes below.
)

func main() {
    // Pass the path that the generated file uses in init()
    // as the argument to proto.RegisterFile
    // (or copy-paste the bytes instead of calling proto.FileDescriptor).
    // The variable is named enc so that it does not shadow the bytes package.
    enc := proto.FileDescriptor(XXX)

    fd, err := decodeFileDesc(enc)
    if err != nil {
        panic(err)
    }
    b, err := json.MarshalIndent(fd, "", "  ")
    if err != nil {
        panic(err)
    }
    fmt.Println(string(b))
}

// decompress does gzip decompression.
func decompress(b []byte) ([]byte, error) {
    r, err := gzip.NewReader(bytes.NewReader(b))
    if err != nil {
        return nil, fmt.Errorf("bad gzipped descriptor: %v", err)
    }
    out, err := ioutil.ReadAll(r)
    if err != nil {
        return nil, fmt.Errorf("bad gzipped descriptor: %v", err)
    }
    return out, nil
}

// decodeFileDesc gunzips enc and unmarshals the result into a FileDescriptorProto.
func decodeFileDesc(enc []byte) (*dpb.FileDescriptorProto, error) {
    raw, err := decompress(enc)
    if err != nil {
        return nil, fmt.Errorf("failed to decompress enc: %v", err)
    }

    fd := new(dpb.FileDescriptorProto)
    if err := proto.Unmarshal(raw, fd); err != nil {
        return nil, fmt.Errorf("bad descriptor: %v", err)
    }
    return fd, nil
}

This prints the data from the proto file as JSON.

As Marc Gravell mentions in a comment on the other answer, gzip output is not guaranteed to be deterministic, so the same proto file can produce different gzipped FileDescriptorProto bytes on two different computers.
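To illustrate the point about gzip, here is a minimal sketch using only the standard library. It compresses the same payload at two different compression levels, which stands in for "two machines compressing differently" (an assumption made for the demo, not necessarily what happened in my case), and then compares the blobs before and after decompression. The helper names gzipAt, gunzip and samePayload are made up for this example, and the payload is a placeholder string rather than a real descriptor.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// gzipAt compresses payload at the given gzip compression level.
// Hypothetical helper: it simulates two compressors making different choices.
func gzipAt(payload []byte, level int) ([]byte, error) {
	var buf bytes.Buffer
	w, err := gzip.NewWriterLevel(&buf, level)
	if err != nil {
		return nil, err
	}
	if _, err := w.Write(payload); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// gunzip returns the decompressed contents of a gzip stream.
func gunzip(b []byte) ([]byte, error) {
	r, err := gzip.NewReader(bytes.NewReader(b))
	if err != nil {
		return nil, err
	}
	defer r.Close()
	return io.ReadAll(r)
}

// samePayload reports whether two gzipped blobs decompress to identical bytes.
func samePayload(a, b []byte) (bool, error) {
	ra, err := gunzip(a)
	if err != nil {
		return false, err
	}
	rb, err := gunzip(b)
	if err != nil {
		return false, err
	}
	return bytes.Equal(ra, rb), nil
}

func main() {
	// Placeholder payload; a real check would use two generated descriptors.
	payload := bytes.Repeat([]byte("message Foo { string bar = 1; } "), 8)

	a, err := gzipAt(payload, gzip.NoCompression)
	if err != nil {
		panic(err)
	}
	b, err := gzipAt(payload, gzip.BestCompression)
	if err != nil {
		panic(err)
	}
	same, err := samePayload(a, b)
	if err != nil {
		panic(err)
	}
	fmt.Println("compressed bytes equal:  ", bytes.Equal(a, b))
	fmt.Println("decompressed bytes equal:", same)
}
```

For comparing two versions of a generated file, the same idea applies: copy both byte slices, pass them to samePayload, and only dig into the decoded descriptor if the decompressed bytes really differ.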

Karel Bílek

A FileDescriptorProto is not plain text. It doesn't contain the original schema as text; rather, it is a protobuf binary-encoded instance of FileDescriptorProto as defined by descriptor.proto, containing the processed meaning of the original schema.

So you could deserialize that payload (once de-gzipped) as a FileDescriptorProto, and use whatever reflection/metadata API is available in Go to get it into some text form. If the Go implementation of protobuf includes the protobuf JSON (rather than binary) API, you could just call the write-JSON API on the FileDescriptorProto instance. Note: not all protobuf implementations implement the JSON API.
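To make "protobuf binary-encoded" a bit more concrete, here is a rough sketch that walks the top-level fields of a protobuf message using only the standard library. It is not a substitute for deserializing with a real protobuf library; the field type and the topLevelFields function are invented for this illustration, and the input bytes are hand-built rather than taken from a real descriptor. In descriptor.proto, field 1 of FileDescriptorProto is name and field 2 is package, which is what the sample bytes encode.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// field is one top-level entry of a protobuf message on the wire.
type field struct {
	Number   uint64 // field number from the .proto definition
	WireType uint64 // 0 = varint, 1 = 64-bit, 2 = length-delimited, 5 = 32-bit
	Varint   uint64 // value, set only for wire type 0
	Payload  []byte // raw value bytes for wire types 1, 2 and 5
}

// take splits off the first n bytes of b, failing if b is too short.
func take(b []byte, n uint64) (head, rest []byte, err error) {
	if uint64(len(b)) < n {
		return nil, nil, errors.New("truncated field")
	}
	return b[:n], b[n:], nil
}

// topLevelFields splits b into its top-level fields without interpreting them.
func topLevelFields(b []byte) ([]field, error) {
	var out []field
	for len(b) > 0 {
		key, n := binary.Uvarint(b)
		if n <= 0 {
			return nil, errors.New("bad field key")
		}
		b = b[n:]
		f := field{Number: key >> 3, WireType: key & 7}
		var err error
		switch f.WireType {
		case 0: // varint
			v, n := binary.Uvarint(b)
			if n <= 0 {
				return nil, errors.New("bad varint")
			}
			f.Varint, b = v, b[n:]
		case 1: // fixed 64-bit
			f.Payload, b, err = take(b, 8)
		case 2: // length-delimited: strings, bytes, nested messages
			l, n := binary.Uvarint(b)
			if n <= 0 {
				return nil, errors.New("bad length")
			}
			f.Payload, b, err = take(b[n:], l)
		case 5: // fixed 32-bit
			f.Payload, b, err = take(b, 4)
		default:
			return nil, fmt.Errorf("unsupported wire type %d", f.WireType)
		}
		if err != nil {
			return nil, err
		}
		out = append(out, f)
	}
	return out, nil
}

func main() {
	// Hand-built sample: field 1 (name) = "a.proto", field 2 (package) = "pkg".
	raw := []byte{0x0a, 0x07, 'a', '.', 'p', 'r', 'o', 't', 'o', 0x12, 0x03, 'p', 'k', 'g'}
	fields, err := topLevelFields(raw)
	if err != nil {
		panic(err)
	}
	for _, f := range fields {
		fmt.Printf("field %d, wire type %d, payload %q\n", f.Number, f.WireType, f.Payload)
	}
}
```

Nested messages show up here only as opaque length-delimited payloads, which is why decoding into a FileDescriptorProto with the protobuf library is the more practical route for real debugging.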

Marc Gravell
  • Ohh. I will try to specify the original question then. A small change that should *not* produce change in this generated file *does*, and I am figuring out why (and it is hard to debug). – Karel Bílek Mar 05 '20 at 08:03
  • @KarelBílek well, do you have an example of what this small change is, so I can offer an opinion on whether it should/shouldn't change the descriptor? Also: note that gzip is not required to be deterministic. All that is required is that once decompressed you get the same data back out that you started with; there can be multiple ways of choosing compression strategies, and "roll a dice" might be a perfectly valid strategy for choosing in some cases! (From memory, when parallelism is involved, it can also depend on the *order* in which the state tables get updated, i.e. blocks A->B->C vs A->C->B.) – Marc Gravell Mar 05 '20 at 08:14
  • You are right, parsing the blob back shows exactly the same results, so it's just gzip being non-deterministic. I will write the code itself as an answer. – Karel Bílek Mar 05 '20 at 08:48