I'm using a free trial account to train my deep learning models on a TPU. Billing is enabled and I still have more than $100 in promotional credits. Two days ago my preemptible TPU was preempted in the middle of a training session.

Since then I have tried multiple times to create a new TPU in different regions but I always got the following error:

Creating TPU node "node-1" failed. Error: APPLICATION_ERROR;google.cloud.tpu.v1/Tpu.CreateNode;Quota limit 'TPUV2sPodPerProjectPerRegionForTPUAPI,TPUV2sPodPerProjectPerZoneForTPUAPI' has been exceeded. Limit: 0,0 in region us-central1,zone us-central1-a.;AppErrorCode=8;StartTimeMs=1591581190314;tcp;Deadline(sec)=59.972117786;ResFormat=UNCOMPRESSED;Originator=traffic-prod;Tag=cidc2cloud_project_number648364020234IncomingMethod/TpuEntityService.CreateTpu;ServerTimeSec=1.122048062;LogBytes=256;Non-FailFast;EffSecLevel=none;ReqFormat=UNCOMPRESSED;ReqID=7f67b6ac43d18f40;GlobalID=1fab9ceb307864dc;Server=[2002:a05:6600:906:b029:cc:7048:9e48]:4001

I thought it had something to do with my quotas, so I checked them and saw that "Preemptible TPU v3 cores per project per region" and "Preemptible TPU v3 cores per project per zone" are both 0. Is this the reason I cannot create new TPUs? If so, how did I manage to create my old TPU? And most importantly, how do I fix this?

Sea Otter

1 Answer

FYI, the error indicates that you're attempting to create a v2 pod, but your description mentions v3.

In any case, you'll see this error when you are trying to create a node that you lack the quota for, so your suspicion is correct - you'll need to work within the confines of the quota that is available to you or request an increase.
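To see exactly which quotas the request ran into, the relevant part of the error can be pulled apart programmatically. The following is only a rough sketch: `parse_quota_error` is a hypothetical helper, and the regular expression assumes the message keeps the same wording as the one quoted in the question.

```python
import re

# Relevant fragment of the error quoted in the question.
ERROR = (
    "Quota limit 'TPUV2sPodPerProjectPerRegionForTPUAPI,"
    "TPUV2sPodPerProjectPerZoneForTPUAPI' has been exceeded. "
    "Limit: 0,0 in region us-central1,zone us-central1-a."
)

# Assumed message layout; adjust if the wording of the error differs.
PATTERN = re.compile(
    r"Quota limit '([^']+)' has been exceeded\. "
    r"Limit: ([\d,]+) in region ([\w-]+),zone ([\w-]+)"
)


def parse_quota_error(message):
    """Return the quota names, their limits, region and zone, or None."""
    match = PATTERN.search(message)
    if match is None:
        return None
    names = match.group(1).split(",")
    limits = [int(value) for value in match.group(2).split(",")]
    return {
        "quotas": dict(zip(names, limits)),
        "region": match.group(3),
        "zone": match.group(4),
    }


info = parse_quota_error(ERROR)
for name, limit in info["quotas"].items():
    print(f"{name}: limit {limit} ({info['region']}, {info['zone']})")
```

Both quota names here contain "TPUV2sPod", which is the quota to compare against the values shown on the Quotas page.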

chrislarkin
  • Thanks for your answer. I tried to create both v2 and v3 and always got the same error. My quotas are currently 0 for both v2 and v3 TPUs. So why was I able to create the TPU that I was using before? – Sea Otter Jun 08 '20 at 08:45
  • Broadly, it sounds like either an issue with the command/actions you are taking to create the node or that your quotas have changed since you were last successful. I'd recommend going into IAM > Quotas, setting the filters "tpu" and "cores", and looking to see exactly which have nonzero values. It could be as simple as forgetting to flip the 'preemptible' bit at creation, but it's hard to say in your individual case. – chrislarkin Jun 09 '20 at 09:19
  • The error indicates that you are trying to create an on-demand TPU pod (a v2-32 or bigger slice), for which you don't have any quota. Could you please check the quota available to your project for v2-8 or v3-8 and try accordingly? – aman2930 Jun 09 '20 at 19:25
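
As a quick sanity check before creating a node, the accelerator type string can be inspected to see whether it names a pod slice. This is a minimal sketch: `describe_accelerator` is a hypothetical helper, and the rule that anything with more than 8 cores counts as a pod slice is an assumption drawn from the last comment (v2-8 and v3-8 as single devices, v2-32 or bigger as pod slices).

```python
def describe_accelerator(accelerator_type):
    """Split a type such as 'v2-32' into version, core count and pod flag."""
    version, cores_text = accelerator_type.lower().split("-")
    cores = int(cores_text)
    return {
        "version": version,
        "cores": cores,
        # Assumption: more than 8 cores means a pod slice, which is
        # counted against the separate pod quota.
        "is_pod_slice": cores > 8,
    }


for accelerator in ("v2-8", "v3-8", "v2-32"):
    details = describe_accelerator(accelerator)
    kind = "pod slice" if details["is_pod_slice"] else "single device"
    print(f"{accelerator}: {details['cores']} cores, {kind}")
```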