
I am building a custom pipeline with the following step:

    trainer_task = (
        trainer(download_task.output)
        .set_cpu_request("16")
        .set_memory_request("60G")
        .add_node_selector_constraint('cloud.google.com/gke-accelerator', "NVIDIA_TESLA_K80")
        .set_gpu_limit(2)
    )

However, when I check the number of GPUs available inside the trainer, it reports zero, and the only visible device is the CPU. I am migrating a project from Kubeflow, and this is my first time using Vertex AI, so I am not sure why this is happening.
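The device check inside the training code is essentially the following (a simplified sketch; the real component does more, but the device listing is the relevant part):

    import tensorflow as tf

    # Inside the trainer component this prints 0, and only a CPU device is listed.
    print("Num GPUs available:", len(tf.config.list_physical_devices('GPU')))
    print("Visible devices:", tf.config.list_physical_devices())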

The step in question is a component that loads a Docker image from Artifact Registry and installs tensorflow-gpu==2.4.1.
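For reference, the component is declared roughly like this (the image path is a placeholder, not the exact one I use):

    from kfp.v2.dsl import component

    # Simplified sketch of the trainer component; the real base image lives in Artifact Registry.
    @component(
        base_image="europe-west1-docker.pkg.dev/my-project/my-repo/trainer-base:latest",
        packages_to_install=["tensorflow-gpu==2.4.1"],
    )
    def trainer(dataset_path: str):
        # Training code that expects the two requested GPUs to be visible.
        ...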

Am I missing something? Why are the specified GPUs not being enabled?

Any help will be highly appreciated!

spalacio
  • The command itself is correct. I would check to ensure that the image you specified actually installs the CUDA drivers correctly. Maybe set up a VM with that image to test. – greedybuddha Feb 25 '22 at 11:58
  • Hey, did you figure this out? I have a similar issue. – skeller88 Sep 07 '22 at 21:48

0 Answers