I am trying to reproduce this tutorial to run a Flex Template on Dataflow.

When I submit the job, I can see it in the console, but it does not start and is marked as Queued. Does this mean that the job was submitted in FlexRS mode? How can I make the job start immediately after submitting it?

farhawa

2 Answers


The "Queued" status for Flex Template jobs means that your container is running on a VM to build the pipeline and start the job. If the job stays in Queued for more than a few minutes, that indicates that this process got stuck. You can view the logs for this VM in the Dataflow UI, in the "Job Logs" section.

danielm
  • Thank you Daniel. So the job has failed, and when I checked the job logs as you suggested I found this: `Output from execution of subprocess: ..... OSError: \'git\' was not found\n ----------------------------------------\nERROR: Command errored out with exit status 1` – farhawa Dec 24 '20 at 10:25
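  • Looks like your container doesn't contain all of the dependencies it needs in order to start the job. In this case, you'll just need to add a line to your Dockerfile that installs git, e.g. `RUN apt-get update && apt-get install -y git` (the `-y` keeps the build from stopping at the confirmation prompt) – danielm Dec 29 '20 at 17:48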
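
For context, here is a minimal sketch of where that install step might sit in a Flex Template Dockerfile. The base image name is an assumption (it is the Python launcher image used in Google's Flex Template tutorial, which the question does not name explicitly), so substitute whatever image your own Dockerfile starts from. The rest of your Dockerfile stays as it is.

```dockerfile
# Assumption: the Python Flex Template launcher base image from the tutorial.
# Replace with the base image your Dockerfile already uses.
FROM gcr.io/dataflow-templates-base/python3-template-launcher-base

# Refresh the package index first, then install git non-interactively (-y),
# and remove the apt cache to keep the image small.
RUN apt-get update \
    && apt-get install -y git \
    && rm -rf /var/lib/apt/lists/*

# ... your existing COPY / ENV / pip install steps follow unchanged ...
```

Placing the install before any `pip install` step matters here, since the error in the logs came from a subprocess that needed git while dependencies were being built.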

This looks like a bug, where an upstream dependency (pyarrow?) started requiring git in order to build, but the base image does not currently include git. I have filed an issue here: https://issuetracker.google.com/issues/176570473

Travis Webb