
I created a Vertex AI pipeline that performs a simple ML flow: creating a dataset, training a model on it, and then predicting on the test set. There is a Python function-based component (train-logistic-model) in which I train the model. However, in the component I specified an invalid package name, so that step of the pipeline fails. I know this because when I corrected the package name the step worked fine. However, for the failed pipeline run I am unable to see any logs. When I click "VIEW JOB" under "Execution Info" on the pipeline Runtime Graph (pic attached), it takes me to the "CUSTOM JOB" page for the job the pipeline ran. There is a message:

Custom job failed with error message: The replica workerpool0-0 exited with a non-zero status of 1 ...

When I click the VIEW LOGS button, it takes me to the Logs Explorer, where there are NO logs. Why are there no logs? Do I need to enable logging somewhere in the pipeline for this? Or could it be a permission issue? (It does not mention anything about permissions, though; there is just this message in the Logs Explorer and 0 logs below it.)

Showing logs for time specified in query. To view more results update your query

[screenshot of the pipeline Runtime Graph]

  • Are the logs visible via GCP Cloud Logging if you compose the query manually in the Logs Explorer? – vitooh Dec 15 '21 at 16:19
  • No they are not, there is simply the header "Showing logs for time specified in query. To view more results update your query" and nothing below it – racerX Dec 15 '21 at 22:17
  • I think you can raise it as a bug in the Public Issue Tracker. In Vertex this is done via the "Send Feedback" button in the documentation ([instruction](https://cloud.google.com/vertex-ai/docs/support/getting-support#file_bugs_or_feature_requests)). If you have a support package you can raise a support ticket as well. – vitooh Dec 30 '21 at 09:21

1 Answer


Find the custom job resource name (the "pipeline job id") in the component logs and paste it into the code below:

```python
from google.cloud import aiplatform

# Region where the pipeline ran, e.g. "us-central1"
location = "us-central1"

api_endpoint = f"{location}-aiplatform.googleapis.com"
client_options = {"api_endpoint": api_endpoint}
client = aiplatform.gapic.JobServiceClient(client_options=client_options)


def get_status_helper(client, job_name):
    """Return the state of a custom job as a string."""
    response = client.get_custom_job(name=job_name)
    return str(response.state)


# Paste the resource name found in the component logs
job = client.get_custom_job(
    name="projects/{project-id}/locations/{your-location}/customJobs/{pipeline-id}"
)
```

Sample name or pipeline job id for reference:

```
projects/123456789101/locations/us-central1/customJobs/23456789101234567892
```

The above name can be found in the component logs.
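If the Logs Explorer UI shows nothing, it can also be worth querying Cloud Logging programmatically with the same job id. A minimal sketch, assuming the `google-cloud-logging` client library is installed and that Vertex AI custom job logs appear under the `ml_job` resource type (both are assumptions; `build_filter` and `print_job_logs` are hypothetical helper names):

```python
def build_filter(job_id: str) -> str:
    """Build a Logs Explorer filter for a custom job.

    Assumption: Vertex AI custom jobs write logs under the
    "ml_job" resource type, keyed by the numeric job id.
    """
    return f'resource.type="ml_job" resource.labels.job_id="{job_id}"'


def print_job_logs(job_id: str) -> None:
    """Print all log entries matching the custom job id."""
    # Imported here so build_filter() can be used without the client library.
    from google.cloud import logging

    client = logging.Client()
    for entry in client.list_entries(filter_=build_filter(job_id)):
        print(entry.payload)
```

Usage would be `print_job_logs("23456789101234567892")` with the job id from the component logs. If this also returns nothing, it suggests the logs were never written (e.g. the container exited before emitting any output) rather than a UI issue.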