
I'd like to train a model with a training container that I've built and pushed to Artifact Registry. I then want to deploy the model behind a Flask app with a /predict route that can handle some custom logic, not necessarily just predicting on an input JSON. As I understand it, the app will also need a /healthz route. So basically I want a pipeline that runs a training job in my own training container and then deploys the model with a Flask app in my own model serving container. Looking around on Stack Overflow, I wonder whether this question's pipeline has the layout I'll eventually want. Something like this:

import kfp
from kfp.v2 import compiler
from kfp.v2.dsl import component
from kfp.v2.google import experimental
from google.cloud import aiplatform
from google_cloud_pipeline_components import aiplatform as gcc_aip

@kfp.dsl.pipeline(name=pipeline_name, pipeline_root=pipeline_root_path)
def pipeline():
    training_job_run_op = gcc_aip.CustomPythonPackageTrainingJobRunOp(
        project=project_id,
        display_name=training_job_name,
        model_display_name=model_display_name,
        python_package_gcs_uri=python_package_gcs_uri,
        python_module=python_module,
        container_uri=container_uri,
        staging_bucket=staging_bucket,
        model_serving_container_image_uri=model_serving_container_image_uri)

    # Upload model
    model_upload_op = gcc_aip.ModelUploadOp(
        project=project_id,
        display_name=model_display_name,
        artifact_uri=output_dir,
        serving_container_image_uri=model_serving_container_image_uri,
    )
    model_upload_op.after(training_job_run_op)

    # Deploy model
    model_deploy_op = gcc_aip.ModelDeployOp(
        project=project_id,
        model=model_upload_op.outputs["model"],
        endpoint=aiplatform.Endpoint(
            endpoint_name='0000000000').resource_name,
        deployed_model_display_name=model_display_name,
        machine_type="n1-standard-2",
        traffic_percentage=100)

# Compile at module level, outside the pipeline function
compiler.Compiler().compile(pipeline_func=pipeline,
                            package_path=pipeline_spec_path)

I'm hoping that model_serving_container_image_uri and serving_container_image_uri both refer to the URI of the model serving container I'm going to build. I've already built a training container that trains a model and saves saved_model.pb to Google Cloud Storage. Other than a Flask app that handles the prediction and health check routes and a Dockerfile that exposes a port for the Flask app, what else do I need to make the model serving container work in this pipeline? Where in the code do I load the model from GCS? In the Dockerfile? How is the model serving container meant to work so that the pipeline comes together smoothly? I'm having trouble finding any tutorials or examples of precisely what I'm trying to do, even though this seems like a pretty common scenario.

To that end, I attempted this with the following pipeline:

import kfp
from kfp.v2 import compiler
from kfp.v2.dsl import component
from kfp.v2.google import experimental
from google.cloud import aiplatform
from google_cloud_pipeline_components import aiplatform as gcc_aip

@kfp.dsl.pipeline(name=pipeline_name, pipeline_root=pipeline_root_path)
def pipeline(
    project: str = [redacted project ID],
    display_name: str = "custom-pipe",
    model_display_name: str = "test_model",
    training_container_uri: str = "us-central1-docker.pkg.dev/[redacted project ID]/custom-training-test",
    model_serving_container_image_uri: str = "us-central1-docker.pkg.dev/[redacted project ID]/custom-model-serving-test",
    model_serving_container_predict_route: str = "/predict",
    model_serving_container_health_route: str = "/healthz",
    model_serving_container_ports: str = "8080"
):
    training_job_run_op = gcc_aip.CustomContainerTrainingJobRunOp(
        display_name=display_name,
        container_uri=training_container_uri,
        model_serving_container_image_uri=model_serving_container_image_uri,
        model_serving_container_predict_route=model_serving_container_predict_route,
        model_serving_container_health_route=model_serving_container_health_route,
        model_serving_container_ports=model_serving_container_ports)

    # Upload model
    model_upload_op = gcc_aip.ModelUploadOp(
        project=project,
        display_name=model_display_name,
        serving_container_image_uri=model_serving_container_image_uri,
    )
    model_upload_op.after(training_job_run_op)

    # Deploy model (commented out for now)
    # model_deploy_op = gcc_aip.ModelDeployOp(
    #     project=project,
    #     model=model_upload_op.outputs["model"],
    #     endpoint=aiplatform.Endpoint(
    #         endpoint_name='0000000000').resource_name,
    #     deployed_model_display_name=model_display_name,
    #     machine_type="n1-standard-2",
    #     traffic_percentage=100)

This fails with:

google.api_core.exceptions.PermissionDenied: 403 Permission 'aiplatform.trainingPipelines.create' denied on resource '//aiplatform.googleapis.com/projects/u15c36a5b7a72fabfp-tp/locations/us-central1' (or it may not exist).

This happens even though my service account has the Viewer and Kubernetes Engine Admin roles needed to work with AI Platform Pipelines. My training container uploads my model to Google Cloud Storage, and the model serving container I've made downloads it and uses it for serving at /predict.
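
For context, the request handling inside a serving container like the one described above can be sketched roughly as follows. This is only an illustration, not the actual container code. It assumes the container is started by Vertex AI prediction, which as far as I understand sets the AIP_PREDICT_ROUTE, AIP_HEALTH_ROUTE, AIP_HTTP_PORT and AIP_STORAGE_URI environment variables, sends request bodies of the form {"instances": [...]} and expects {"predictions": [...]} back. The function names serving_config and handle_predict are hypothetical, and the model itself is stood in for by a plain callable. The sketch deliberately leaves out the Flask wiring so that it has no dependencies.

```python
import json
import os
from typing import Any, Callable, Dict, List, Mapping


def serving_config(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Resolve routes, port and model location from AIP_* variables.

    The fallback values mirror the ones used in the pipeline above and
    only apply when the variables are absent (e.g. when run locally).
    """
    return {
        "predict_route": env.get("AIP_PREDICT_ROUTE", "/predict"),
        "health_route": env.get("AIP_HEALTH_ROUTE", "/healthz"),
        "port": int(env.get("AIP_HTTP_PORT", "8080")),
        # GCS location of the model artifacts; None if not provided.
        "storage_uri": env.get("AIP_STORAGE_URI"),
    }


def handle_predict(
    body: bytes,
    predict_fn: Callable[[List[Any]], List[Any]],
) -> Dict[str, Any]:
    """Turn a raw request body into a response dictionary.

    predict_fn is a placeholder for whatever custom logic and model
    call the /predict route should run on the list of instances.
    """
    payload = json.loads(body)
    instances = payload["instances"]
    return {"predictions": predict_fn(instances)}


if __name__ == "__main__":
    config = serving_config()
    print("would serve", config["predict_route"], "on port", config["port"])
    # Dummy "model" that doubles each numeric instance.
    response = handle_predict(b'{"instances": [1, 2, 3]}',
                              lambda xs: [x * 2 for x in xs])
    print(json.dumps(response))
```

In a Flask app, the /predict view would pass request.get_data() to handle_predict and return the result as JSON, while the health route would simply return a 200 response. The model would be downloaded from the storage location at container start-up rather than in the Dockerfile.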