I have deployed a trained PyTorch model to a Google Vertex AI Prediction endpoint. The endpoint is working fine, giving me predictions, but when I examine its logs in Logs Explorer, I see:
INFO 2023-01-11T10:34:53.270885171Z Number of GPUs: 0
INFO 2023-01-11T10:34:53.270888834Z Number of CPUs: 4
This is despite the fact that I configured the endpoint to use NVIDIA_TESLA_T4 as the accelerator type.
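Roughly, the deployment looked like this (a sketch using the Vertex AI Python SDK; the project, region, and model ID below are placeholders, and the exact call in my code may differ slightly):

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # placeholder IDs

model = aiplatform.Model("1234567890")  # placeholder model resource ID

endpoint = model.deploy(
    machine_type="n1-standard-4",        # 4 vCPUs, matching "Number of CPUs: 4"
    accelerator_type="NVIDIA_TESLA_T4",  # the GPU I requested
    accelerator_count=1,
)
```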
Why does the log show 0 GPUs, and does this mean TorchServe is not taking advantage of the T4 accelerator?
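For what it's worth, I'd expect the count TorchServe logs to match what PyTorch itself can see inside the serving container. A minimal check of that condition (a sketch, not TorchServe's actual detection code, and guarded in case torch isn't installed):

```python
# Minimal reproduction of the check (a sketch; TorchServe's own detection differs).
# Falls back to 0 if torch isn't installed, mirroring a CPU-only environment.
try:
    import torch
    num_gpus = torch.cuda.device_count()  # 0 when no CUDA device is visible
except ImportError:
    num_gpus = 0
print(f"Number of GPUs: {num_gpus}")
```

If this printed 0 inside the container, it would confirm that CUDA isn't visible to the runtime at all, rather than TorchServe simply ignoring an available GPU.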