I currently have two protobuf repos, api and timestamp:

timestamp Repo:

- README.md
- timestamp.proto
- timestamp.pb.go
- go.mod
- go.sum

api Repo:

- README.md
- protos/
  - dto1.proto
  - dto2.proto

Currently, timestamp contains a Timestamp definition that I want to use in api, but I'm not sure how the import should work or how I should modify the compilation process to handle it. Complicating this is the fact that the api repo is compiled into a separate, downstream repo for Go called api-go.

For example, consider dto1.proto:

syntax = "proto3";
package api.data;

import "<WHAT GOES HERE?>";

option go_package = "github.com/my-user/api/data"; // golang

message DTO1 {
    string id = 1;
    Timestamp timestamp = 2;
}

And my compilation command is this:

find $GEN_PROTO_DIR -type f -name "*.proto" -exec protoc \
    --go_out=$GEN_OUT_DIR --go_opt=module=github.com/my-user/api-go \
    --go-grpc_out=$GEN_OUT_DIR --go-grpc_opt=module=github.com/my-user/api-go \
    --grpc-gateway_out=$GEN_OUT_DIR --grpc-gateway_opt logtostderr=true \
    --grpc-gateway_opt paths=source_relative \
    --grpc-gateway_opt generate_unbound_methods=true {} \;

Assuming I have a definition in timestamp for each of the programming languages I want to compile api into, how would I import this into the .proto file and what should I do to ensure that the import doesn't break in my downstream repo?

Woody1193

1 Answer

There is no native notion of remote import paths in protobuf, so an import path has to be relative to some local filesystem base path that you indicate to the compiler (via -I / --proto_path).
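To illustrate (a sketch with hypothetical paths): if dto1.proto contained import "defs/timestamp.proto", protoc would search for defs/timestamp.proto under each supplied base path, never over the network:

protoc \
    -I /path/to/proto/root \
    --go_out=gen \
    /path/to/proto/root/dto1.proto

# resolves the import to /path/to/proto/root/defs/timestamp.proto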

Option 1

Generally it is easiest to just have a single repository with the protobuf definitions for your organisation, e.g. a repository named acme-contract:

.
└── protos
    └── acme
        ├── api
        │   └── data
        │       ├── dto1.proto
        │       └── dto2.proto
        └── timestamp
            └── timestamp.proto

Your dto1.proto will look something like:

syntax = "proto3";

package acme.api.data;

import "acme/timestamp/timestamp.proto";

message DTO1 {
  string id = 1;
  acme.timestamp.Timestamp timestamp = 2;
}

As long as you generate code relative to the protos/ dir of this repository, there shouldn't be an issue.
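For instance, a minimal sketch run from the repository root (the gen output directory and plugin options here are assumptions):

find protos -type f -name "*.proto" -exec protoc \
    -I protos \
    --go_out=gen --go_opt=paths=source_relative \
    --go-grpc_out=gen --go-grpc_opt=paths=source_relative \
    {} +

With -I protos as the base path, import "acme/timestamp/timestamp.proto" resolves to protos/acme/timestamp/timestamp.proto.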

Option 2

There are various alternatives whereby you continue to have definitions split over various repositories, but you can't really escape the fact that imports are filesystem relative.

Historically that could be handled by manually cloning the various repositories and arranging directories such that the paths are relative, or by using -I to point to various locations that might intentionally or incidentally contain the proto files (e.g. in $GOPATH). Those strategies tend to end up fairly messy and difficult to maintain.
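As a rough illustration of the clone-based approach (a sketch only; the URL and paths are assumptions):

# Temporarily clone the dependency so its protos exist on the local filesystem
git clone --depth 1 https://github.com/my-user/timestamp /tmp/timestamp

# Supply both roots via -I so imports can resolve across the two repos
protoc \
    -I protos \
    -I /tmp/timestamp \
    --go_out=gen --go_opt=paths=source_relative \
    protos/dto1.proto

# Wipe the temporary clone afterwards
rm -rf /tmp/timestamp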

buf makes things somewhat easier now. If you were to lay out your timestamp repo like so:

.
├── buf.gen.yaml
├── buf.work.yaml
├── gen
│   └── acme
│       └── timestamp
│           └── timestamp.pb.go
├── go.mod
├── go.sum
└── protos
    ├── acme
    │   └── timestamp
    │       └── timestamp.proto
    ├── buf.lock
    └── buf.yaml

timestamp.proto looking like:

syntax = "proto3";

package acme.timestamp;

option go_package = "github.com/your-user/timestamp/gen/acme/timestamp";

message Timestamp {
  int64 unix = 1;
}

buf.gen.yaml looking like:

version: v1
plugins:
  - name: go
    out: gen
    opt: paths=source_relative
  - name: go-grpc
    out: gen
    opt:
      - paths=source_relative
      - require_unimplemented_servers=false
  - name: grpc-gateway
    out: gen
    opt:
      - paths=source_relative
      - generate_unbound_methods=true

... and everything under gen/ has been generated via buf generate.
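For the api repo to be able to depend on this module, it would also need to be pushed to the Buf Schema Registry (a sketch; assumes you've already authenticated, e.g. via buf registry login):

$ cd protos
$ buf push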

Then in your api repository:

.
├── buf.gen.yaml
├── buf.work.yaml
├── gen
│   └── acme
│       └── api
│           └── data
│               ├── dto1.pb.go
│               └── dto2.pb.go
└── protos
    ├── acme
    │   └── api
    │       └── data
    │           ├── dto1.proto
    │           └── dto2.proto
    ├── buf.lock
    └── buf.yaml

With buf.yaml looking like:

version: v1
name: buf.build/your-user/api
deps:
  - buf.build/your-user/timestamp
breaking:
  use:
    - FILE
lint:
  use:
    - DEFAULT
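The buf.lock alongside it is produced by resolving the declared deps against the BSR (assuming the timestamp module has already been pushed):

$ cd protos
$ buf mod update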

dto1.proto looking like:

syntax = "proto3";

package acme.api.data;

import "acme/timestamp/timestamp.proto";

option go_package = "github.com/your-user/api/gen/acme/api/data";

message DTO1 {
  string id = 1;
  acme.timestamp.Timestamp timestamp = 2;
}

and buf.gen.yaml the same as in the timestamp repo.

The code generated via buf generate will depend on the timestamp repository via Go modules:

// Code generated by protoc-gen-go. DO NOT EDIT.
// versions:
//  protoc-gen-go v1.28.1
//  protoc        (unknown)
// source: acme/api/data/dto1.proto

package data

import (
    timestamp "github.com/your-user/timestamp/gen/acme/timestamp"
    protoreflect "google.golang.org/protobuf/reflect/protoreflect"
    protoimpl "google.golang.org/protobuf/runtime/protoimpl"
    reflect "reflect"
    sync "sync"
)

// <snip>

Note that if changes are made to dependencies you'll need to ensure that both buf and Go modules are kept relatively in sync.
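For example, after pushing a new version of the timestamp module, a typical update sequence might look like (a sketch):

$ (cd protos && buf mod update)   # refresh buf.lock
$ buf generate                    # regenerate pb code
$ go get github.com/your-user/timestamp@latest
$ go mod tidy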

Option 3

If you prefer not to leverage Go modules for importing generated pb code, you could also look to have a similar setup to Option 2, but instead generate all code into a separate repository (similar to what you're doing now, by the sounds of it). This is most easily achieved using buf managed mode, which essentially makes go_package options in the proto files unnecessary and ignores any that are present.

In api-go:

.
├── buf.gen.yaml
├── go.mod
└── go.sum

With buf.gen.yaml containing:

version: v1
managed:
  enabled: true
  go_package_prefix:
    default: github.com/your-user/api-go/gen
plugins:
  - name: go
    out: gen
    opt: paths=source_relative
  - name: go-grpc
    out: gen
    opt:
      - paths=source_relative
      - require_unimplemented_servers=false
  - name: grpc-gateway
    out: gen
    opt:
      - paths=source_relative
      - generate_unbound_methods=true

You'd then need to generate code for each respective module (pushed to the BSR):

$ buf generate buf.build/your-user/api
$ buf generate buf.build/your-user/timestamp

After which you should have some generated code for both:

.
├── buf.gen.yaml
├── gen
│   └── acme
│       ├── api
│       │   └── data
│       │       ├── dto1.pb.go
│       │       └── dto2.pb.go
│       └── timestamp
│           └── timestamp.pb.go
├── go.mod
└── go.sum

And the imports will be relative to the current module:

// Code generated by protoc-gen-go. DO NOT EDIT.
// versions:
//  protoc-gen-go v1.28.1
//  protoc        (unknown)
// source: acme/api/data/dto1.proto

package data

import (
    timestamp "github.com/your-user/api-go/gen/acme/timestamp"
    protoreflect "google.golang.org/protobuf/reflect/protoreflect"
    protoimpl "google.golang.org/protobuf/runtime/protoimpl"
    reflect "reflect"
    sync "sync"
)

// <snip>

All in all, I'd recommend Option 1, consolidating your protobuf definitions into a single repository (including vendoring 3rd-party definitions), unless there is a particularly strong reason not to.

nj_
  • This was an excellent answer, thank you. I think I'd like to use Option 2 but you've added a lot of buf files and I'm not sure what each should contain or what their purpose is. – Woody1193 Nov 01 '22 at 03:16
  • I'm not sure how buf is actually capable of finding my timestamp repo? Do I have to push it to the BSR? – Woody1193 Nov 01 '22 at 06:01
  • The [buf docs](https://docs.buf.build/introduction) are really good, and should hopefully give you a reasonable idea of what those files represent. I can share some examples still, but I'd really recommend familiarising yourself with the tool. – nj_ Nov 01 '22 at 09:56
  • And yes, a push to the BSR would be required if you were to run with this option. You can still achieve Option 2 without `buf`, but you'd need some sort of setup where the timestamp repo is readable from the local fs, and you're supplying a `-I` flag so that `protoc` can resolve the proto files contained within that repo. At a previous job we ended up with a script that would temporarily clone the dependent repo, run protoc, and then wipe the temporary clone afterwards - it was messy but worked. But `buf` certainly makes it easier. – nj_ Nov 01 '22 at 09:58
  • That's what I was afraid of. I suppose I'll try for the redesign then – Woody1193 Nov 01 '22 at 23:49