
I've honed my transformations in Dataprep and am now trying to run the Dataflow job directly using the gcloud CLI.

I've exported my template and template metadata file, and am trying to run them using `gcloud dataflow jobs run`, passing in the input and output locations as parameters.
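The command looks roughly like this; the job name, bucket paths and parameter names below are placeholders for illustration, not my real values:

```bash
# Sketch of the launch command (all names and paths are placeholders)
gcloud dataflow jobs run my-dataprep-job \
  --gcs-location=gs://my-bucket/templates/my_template \
  --parameters=inputLocations=gs://my-bucket/input/source.csv,outputLocations=gs://my-bucket/output/
```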

I'm getting the error:

Template metadata regex '[ \t\n\x0B\f\r]*\{[ \t\n\x0B\f\r]*((.|\r|\n)*".*"[ \t\n\x0B\f\r]*:[ \t\n\x0B\f\r]*".*"(.|\r|\n)*){17}[ \t\n\x0B\f\r]*\}[ \t\n\x0B\f\r]*' was too large. Max size is 1000 but was 1187.

I've not specified this at the command line, so I know it's coming from the metadata file, which is straight from Dataprep and unedited by me.

I have 17 input locations: one contains the source data, and the rest are lookups. There is a regex for each one, plus one extra.
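For context, the exported file follows the standard Dataflow template metadata layout, with a `regexes` array per parameter. Schematically it looks like this; the names and regexes below are illustrative, not the actual Dataprep output, and the real file has one entry per input:

```json
{
  "name": "my-dataprep-template",
  "description": "Template exported from Dataprep",
  "parameters": [
    {
      "name": "inputLocation1",
      "label": "Input location 1",
      "helpText": "GCS path to the source data",
      "regexes": ["^gs:\\/\\/[^\\s]+$"]
    },
    {
      "name": "outputLocation",
      "label": "Output location",
      "helpText": "GCS path to write results to",
      "regexes": ["^gs:\\/\\/[^\\s]+$"]
    }
  ]
}
```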

If it runs fine when triggered from Dataprep but won't run via the CLI, am I missing something?

Adam Hopkinson

1 Answer


I'd suspect the root cause here is a limitation in gcloud that is not present in the Dataflow API or Dataprep. The best thing to do is open a new Cloud Dataflow issue in the public tracker and provide the details there.
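As a possible workaround in the meantime, you could try launching the exported template through the Dataflow REST API directly, which may not apply the same check as gcloud. A minimal sketch, assuming placeholder project, bucket and parameter names:

```bash
# Launch the template via the Dataflow REST API (v1b3) instead of gcloud.
# Project, paths, job name and parameter names are placeholders.
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://dataflow.googleapis.com/v1b3/projects/my-project/templates:launch?gcsPath=gs://my-bucket/templates/my_template" \
  -d '{
        "jobName": "my-dataprep-job",
        "parameters": {
          "inputLocations": "gs://my-bucket/input/source.csv",
          "outputLocations": "gs://my-bucket/output/"
        }
      }'
```

If the job launches that way, it would confirm the size limit is specific to gcloud rather than to the service itself.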

James