I am trying to run Kubeflow Pipelines with the new Vertex AI on GCP.

Previously, in Kubeflow Pipelines, I was able to use the Run ID in my pipeline through dsl.RUN_ID_PLACEHOLDER or {{workflow.uid}}. My understanding was that dsl.RUN_ID_PLACEHOLDER would resolve to {{workflow.uid}} at compile time, and that at run time the {{workflow.uid}} tag would be resolved to the Run's ID. This is at least how it has worked in my experience with Kubeflow Pipelines and the Kubeflow Pipelines UI.
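
To make that two-stage picture concrete, here is a toy sketch of how I understand it. It is not KFP code: the function names and the example run ID are made up for illustration, and only the {{workflow.uid}} string is taken from what I observed.

```python
# Toy model only: this is NOT KFP code. The function names and the example
# run ID are hypothetical; only the "{{workflow.uid}}" string comes from above.
RUN_ID_PLACEHOLDER = "{{workflow.uid}}"


def compile_pipeline(default_run_id: str) -> dict:
    """'Compile time': the placeholder is stored as a plain string."""
    return {"parameters": {"run_id": default_run_id}}


def run_pipeline(spec: dict, workflow_uid: str) -> dict:
    """'Run time': the backend swaps the tag for the real run ID."""
    return {
        name: value.replace("{{workflow.uid}}", workflow_uid)
        for name, value in spec["parameters"].items()
    }


spec = compile_pipeline(RUN_ID_PLACEHOLDER)
resolved = run_pipeline(spec, workflow_uid="1234-abcd")
print(spec["parameters"]["run_id"])  # still the literal tag
print(resolved["run_id"])            # the substituted run ID
```

If the second step never happens, the component only ever sees the literal tag, which matches what I describe next.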

However, when I try to access the Run ID in a similar way in a pipeline that I run in Vertex AI Pipelines, it seems that dsl.RUN_ID_PLACEHOLDER resolves to {{workflow.uid}} but that this never subsequently resolves to the ID of the run.

I created the following test pipeline, which gets the Run ID through the DSL placeholder and then uses a lightweight component to print the value of the pipeline's run_id parameter. When I run the pipeline, the print_run_id component prints the literal string {{workflow.uid}}, whereas on Kubeflow Pipelines it would previously have resolved to the Run ID.

from kfp import dsl
from kfp import components as comp
import logging
from kfp.v2.dsl import (
    component,
    Input,
    Output,
    Dataset,
    Metrics,
)

@component
def print_run_id(run_id:str):
    print(run_id)

RUN_ID = dsl.RUN_ID_PLACEHOLDER

@dsl.pipeline(
    name='end-to-end-pipeline',
    description='End to end XGBoost cover type training pipeline'
)
def end_to_end_pipeline(
    run_id: str = RUN_ID
):
    print_task = print_run_id(run_id=run_id)

Is there a way to access the Run ID using the KFP SDK with Vertex AI Pipelines?

ML6-Liam
  • Did you try changing every `from kfp` to `from kfp.v2`? Is this just part of your code, and have you also set up your Google Cloud project and development environment? – PjoterS Jul 13 '21 at 09:02
  • Thanks, I updated to kfp.v2, which I should have done to start with! I now get an error from my deploy pipeline script indicating that RUN_ID_PLACEHOLDER is not an attribute of kfp.v2.dsl. Is there anything mimicking this behavior in Vertex AI? Being able to obtain the Run ID like this has been very helpful. – ML6-Liam Jul 14 '21 at 14:39
  • It's really hard to find any information about this, and it is not covered by the GCP documentation. I've created an issue in Google's issue tracker, which can be found [here](https://issuetracker.google.com/issues/193880774). The product team is already looking into it. Please click +1 on the issue to let Google employees know that you are affected. You can keep track of the status by following the issue. – PjoterS Jul 16 '21 at 13:48

2 Answers

Vertex AI uses a different set of placeholder strings.

Specifically:

from kfp.v2 import dsl

id = dsl.PIPELINE_JOB_ID_PLACEHOLDER
name = dsl.PIPELINE_JOB_NAME_PLACEHOLDER

See also https://github.com/kubeflow/pipelines/blob/master/sdk/python/kfp/v2/dsl/__init__.py

I just got this answer from our Google support contact.

0

This isn't documented well. If you read the placeholders while the pipeline is being built, or from inside an individual component's own code, you get the unmodified placeholder strings. They are resolved only when they are passed as component arguments and the pipeline actually runs (at least for kfp v2).

Example for kfp v2 based on this link that works as expected.

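import kfp.v2.dsl as dsl
from kfp.v2.dsl import component

@component
def print_op(msg: str, value: str):
    print(msg, value) # <-- Prints the correct value

@component
def incorrect_print_op():
    # A lightweight component only sees imports made inside its own body.
    from kfp.v2 import dsl
    print(dsl.PIPELINE_JOB_NAME_PLACEHOLDER) # <-- Prints incorrect placeholder value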

@dsl.pipeline(name='pipeline-with-placeholders')
def my_pipeline():

    # Correct
    print_op(msg='job name:', value=dsl.PIPELINE_JOB_NAME_PLACEHOLDER)
    print_op(msg='job resource name:', value=dsl.PIPELINE_JOB_RESOURCE_NAME_PLACEHOLDER)
    print_op(msg='job id:', value=dsl.PIPELINE_JOB_ID_PLACEHOLDER)
    print_op(msg='task name:', value=dsl.PIPELINE_TASK_NAME_PLACEHOLDER)
    print_op(msg='task id:', value=dsl.PIPELINE_TASK_ID_PLACEHOLDER)
    
    # Incorrect
    print(dsl.PIPELINE_TASK_ID_PLACEHOLDER) # Prints only placeholder value
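
One way to picture the difference, as a toy model rather than the actual Vertex AI implementation: substitution is applied to the arguments of a task, not to the code inside a component. This is my reading of the observed behaviour, not something the documentation states. The placeholder text, job name, and helper names below are all hypothetical.

```python
# Toy model only, not the Vertex AI implementation. The placeholder text,
# the job name and the helper names below are hypothetical.
JOB_NAME_PLACEHOLDER = "<<job-name-placeholder>>"


def component_with_argument(value: str) -> str:
    # The value arrives as an argument, so the backend could swap it first.
    return value


def component_with_constant() -> str:
    # The constant lives inside the function body, which is never rewritten.
    return JOB_NAME_PLACEHOLDER


def run_task(func, arguments: dict, job_name: str) -> str:
    """Substitute the placeholder in the task's arguments, then call it."""
    resolved = {
        key: val.replace(JOB_NAME_PLACEHOLDER, job_name)
        for key, val in arguments.items()
    }
    return func(**resolved)


passed = run_task(component_with_argument, {"value": JOB_NAME_PLACEHOLDER}, "my-job")
embedded = run_task(component_with_constant, {}, "my-job")
print(passed)    # the real job name
print(embedded)  # the untouched placeholder
```

In this model only the argument route ever sees the real value, which is why the fix is to pass the placeholder into the component as a parameter.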

greedybuddha