My image is 11.4 GB. I'm trying the following command:

gcloud beta ai-platform versions create v1 \
  --region=$REGION \
  --model=$MODEL \
  --machine-type=n1-highmem-2 \
  --image=$REGION-docker.pkg.dev/$PROJECT_ID/$REPO_NAME/$IMAGE_NAME \
  --ports=8000 

and it's failing with:

Error: model server never became ready. Please validate that your model file or container configuration are valid, Error details: model server never became ready: status: "False" last_transition_time { seconds: 1603932493 } reason: "ContainersNotReady" message: "containers with unready status: [online-prediction-be-test85b70d-v1fb1b0606]" type: "ContainersReady"

even though I can run the image locally with docker run. What do I need to change?

When I run without specifying a region and with verbosity set to DEBUG, I get:

RecursionError: maximum recursion depth exceeded while calling a Python object
ERROR: gcloud crashed (RecursionError): maximum recursion depth exceeded while calling a Python object

The Python stack trace is massive (last couple of lines):
  File "/googlecloudsdk/core/log.py", line 484, in ShowStructuredOutput
    show_messages = properties.VALUES.core.show_structured_logs.Get()
  File "/googlecloudsdk/core/properties.py", line 2380, in Get
    value = _GetProperty(self, named_configs.ActivePropertiesFile.Load(),
  File "/googlecloudsdk/core/properties.py", line 2679, in _GetProperty
    value = _GetPropertyWithoutDefault(prop, properties_file)
  File "/google-cloud-sdk/lib/googlecloudsdk/core/properties.py", line 2711, in _GetPropertyWithoutDefault
    value = _GetPropertyWithoutCallback(prop, properties_file)
  File "/google-cloud-sdk/lib/googlecloudsdk/core/properties.py", line 2741, in _GetPropertyWithoutCallback
    for value_flags in reversed(invocation_stack):

gkv
  • Can you try removing the `--region` flag? Then set `--verbosity` to **debug** to get more detailed information about the error. – Alex G Oct 30 '20 at 05:46

2 Answers

I can't make much of this error message, but my guess is that gcloud fails to run a health check against your container. Can you specify the health check route with --health-route and see if that helps?
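One way to test that guess before redeploying is to start the image locally and probe the health route yourself. The following is only a minimal sketch using the Python standard library; the function name `health_route_ok`, the `/healthz` default, and the local URL are illustrative assumptions, not part of gcloud or AI Platform.

```python
import urllib.error
import urllib.request


def health_route_ok(base_url, route="/healthz", timeout=5.0):
    """Return True if a GET on base_url + route answers with HTTP 200.

    Hypothetical helper: the route default is an assumption and should
    match whatever you pass to --health-route.
    """
    url = base_url.rstrip("/") + route
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status (4xx/5xx)
        # are all treated as "not healthy".
        return False
```

With the container running locally and port 8000 published (the port from the question), `health_route_ok("http://localhost:8000")` should return True if the route exists and answers with 200. If it returns False locally, the platform's health check will most likely fail as well.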

Tsvi Sabo

Define a health route in your HTTP web server and pass --container-health-route=/healthz when uploading the model. For example, if you are using a Flask web server, add the following to your Flask app:

@app.route('/healthz')
def healthz():
    return "OK"

Vertex AI (or Google AI Platform) requires your HTTP web server to expose a health check route. Please refer to the linked documentation to understand the requirements in detail.
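If you are not using Flask, the same idea can be sketched with the Python standard library alone. This is a stand-in for illustration, not the Flask approach above and not an official template. It assumes the platform may pass the route in an `AIP_HEALTH_ROUTE` environment variable (falling back to `/healthz`), and `make_server` is a hypothetical helper name.

```python
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Assumption: the platform may provide the health route via AIP_HEALTH_ROUTE.
# Fall back to /healthz, the route used in the Flask snippet above.
HEALTH_ROUTE = os.environ.get("AIP_HEALTH_ROUTE", "/healthz")


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == HEALTH_ROUTE:
            # Health check: answer 200 with a small plain-text body.
            body = b"OK"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)


def make_server(port=8000):
    # Bind to 0.0.0.0 so the server is reachable from outside the container.
    # 8000 matches the --ports value in the question.
    return ThreadingHTTPServer(("0.0.0.0", port), Handler)
```

Calling `make_server().serve_forever()` as the container's entry point would start the server. A real container would also need a prediction route handling POST requests, which this sketch leaves out.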