I'm running Vertex AI batch predictions with a custom XGBoost model, with Explainable AI enabled to compute Shapley values.
The explanation step is quite computationally intensive, so I tried splitting the input dataset into chunks and submitting 5 batch prediction jobs in parallel. When I do this, I receive the error "Quota exhausted. Please reach to ai-platform-unified-feedback@google.com for batch prediction quota increase".
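To make the setup concrete, here is a minimal sketch of the kind of split described above. It is illustrative only, not the exact code used: `split_into_chunks` is a hypothetical helper, and the row values are placeholders standing in for real input instances.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_chunks(rows: Sequence[T], num_chunks: int) -> List[List[T]]:
    """Split rows into num_chunks contiguous chunks of near-equal size.

    Hypothetical helper for illustration; any remainder is spread over
    the first chunks so sizes differ by at most one.
    """
    if num_chunks < 1:
        raise ValueError("num_chunks must be at least 1")
    base, extra = divmod(len(rows), num_chunks)
    chunks: List[List[T]] = []
    start = 0
    for i in range(num_chunks):
        # The first `extra` chunks each take one additional row.
        size = base + (1 if i < extra else 0)
        chunks.append(list(rows[start:start + size]))
        start += size
    return chunks


# Placeholder data: 12 rows split for 5 parallel jobs.
chunks = split_into_chunks(list(range(12)), 5)
chunk_sizes = [len(c) for c in chunks]
```

Each chunk would then be written to its own input file and submitted as a separate job, for example via `Model.batch_predict(..., generate_explanation=True, sync=False)` in the `google-cloud-aiplatform` SDK, assuming that SDK is what is being used.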
I don't understand why I'm hitting the quota. According to the docs, there is a limit on the number of concurrent jobs for AutoML models, but they don't mention any such limit for custom models.
Is the quota perhaps on the number of instances the batch predictions are running on? I'm using an n1-standard-8 instance for my predictions.
I've tried changing the instance type and launching fewer jobs in parallel, but I still get the same error.