
My Dataflow job has been failing since 7 AM this morning with the following error:

Startup of the worker pool in zone europe-west3-c failed to bring up any of the desired 1 workers. ZONE_RESOURCE_POOL_EXHAUSTED: Instance '' creation failed: The zone 'projects//zones/europe-west3-c' does not have enough resources available to fulfill the request. Try a different zone, or try again later.

I tried to launch the job in europe-west3-a and europe-west3-b and I get the same error. It's been well over 12 hours and the problem persists. I know this is not a general resource availability problem, as I can create a new VM in that region without any problems.

I even have a case open with Google Support, but unfortunately they don't seem to read my ticket and simply send a standard reply asking me to do things I've already tried.

Any idea what I can do here?

Update 1:

I tried to create a new job with --worker-machine-type=e2-standard-2 and that works. The problem seems to be related to the machine type that the service selects when none is specified.
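For anyone wanting to try the same workaround, here is a minimal sketch of pinning the machine type at launch. It assumes the job is started from a classic template via `gcloud dataflow jobs run`; the job name, template path, and the helper `build_launch_command` are hypothetical placeholders and not taken from my actual setup.

```python
import shlex


def build_launch_command(job_name, template_path, region, machine_type, worker_zone=None):
    """Build the gcloud argv for a templated Dataflow job with a pinned worker machine type.

    job_name and template_path are hypothetical placeholders supplied by the caller.
    """
    cmd = [
        "gcloud", "dataflow", "jobs", "run", job_name,
        f"--gcs-location={template_path}",
        f"--region={region}",
        # Pin the worker machine type instead of leaving the choice to the service.
        f"--worker-machine-type={machine_type}",
    ]
    if worker_zone:
        # Optionally pin the zone too; omit it to let the service pick a zone in the region.
        cmd.append(f"--worker-zone={worker_zone}")
    return cmd


if __name__ == "__main__":
    cmd = build_launch_command(
        job_name="my-job",  # hypothetical
        template_path="gs://my-bucket/templates/my-template",  # hypothetical
        region="europe-west3",
        machine_type="e2-standard-2",
    )
    # Print the command for inspection; pass cmd to subprocess.run to actually launch.
    print(shlex.join(cmd))
```

The command is only printed here so it can be checked before running. If the pipeline is launched directly from SDK code rather than from a template, the equivalent setting would be passed as a pipeline option instead.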

Update 2:

We are now going into day 2 of the problem in europe-west3. Our dev environment is in europe-west1 and this problem doesn't occur there.

marcoseu

1 Answer


This error occurs when Compute Engine resources, such as GPUs, are currently unavailable in that zone. It is not related to your Compute Engine quota. You can resolve the issue by creating the resource in another zone in the region or in a different region.

You can read more about this error and other ways to resolve it in this document.

Nicole Lumod
If you read my question, you will see that I did try "creating the resource in another zone" and that did not help. I've also added an update noting that the problem is fixed if I specify the machine type. Also, "This error occurs due to current unavailability of Compute Engine resources like GPUs in that zone" does not address the point, as I clearly mention that "I know this is not a general resource availability problem as I can create a new VM in that region without any problems.". Again, please read the post before answering the question. – marcoseu Jul 07 '22 at 10:41