
SDK: Apache Beam SDK for Go 0.5.0

We are running Apache Beam Go SDK jobs on Google Cloud Dataflow. They had been working fine until recently, when they started failing intermittently (no changes were made to code or config). The error is:

Failed to retrieve staged files: failed to retrieve worker in 3 attempts: bad MD5 for /var/opt/google/staged/worker: ..., want ; bad MD5 for /var/opt/google/staged/worker: ..., want ;

(Note: the error message appears to be missing a second hash value after "want".)

My best guess is that something is wrong with the worker: it seems to be comparing MD5 hashes of the worker binary, and one of the two values is missing. I don't know exactly what it is comparing against, though.

Does anybody know what could be causing this issue?

Tim
  • Some additional notes: the error mentions the path `/var/opt/google/staged/worker`, but when I SSH into the VM the only path I can see is `/var/opt/google/dataflow/staged/worker`. The binary there seems to match the expected size, but I'm not sure why the paths differ. – Tim Dec 18 '18 at 23:47
  • I installed [Go 1.12](https://golang.org/doc/install) and ran the [Beam Go SDK Quickstart](https://beam.apache.org/get-started/quickstart-go/); it worked. Your error could be an issue in version 0.5.0 of the Go SDK that has since been fixed, unless something specific in your program is causing it. By the way, the Beam SDK for Go is not in the list of programming languages supported by Dataflow; only [Java, Python and REST are supported](https://cloud.google.com/dataflow/docs/apis). – rsantiago Sep 26 '19 at 17:28

2 Answers


The fix for this issue seems to have been rebuilding the `worker_harness_container_image` with the latest changes. I had tried this before, but I didn't have the latest release when I built it locally. After I pulled the latest from the Beam repo, rebuilt the image (as per the notes at https://github.com/apache/beam/blob/master/sdks/CONTAINERS.md), and reran the job, it seemed to work again.

Tim

I'm seeing the same thing. If I look into the Stackdriver logging I see this:

Handler for GET /v1.27/images/apache-docker-beam-snapshots-docker.bintray.io/beam/go:20180515/json returned error: No such image: apache-docker-beam-snapshots-docker.bintray.io/beam/go:20180515

However, I can pull the image just fine locally. Any ideas why Dataflow cannot pull it?

florianrosenberg