
I am running several Kubeflow Pipelines (kfp) steps in parallel. When they have all completed, and only if they have all succeeded, I would like to trigger a final step. With my current implementation the final step triggers if any of the previous ones succeeds, which is not what I intend.

I have been looking at the documentation but I could not find a straightforward way to do it. Could someone provide an example?

G. Macia

2 Answers


If Kubeflow knows about a dependency between your steps (e.g. step B consumes the output of step A), then this ordering happens automatically. Otherwise, you can declare the dependency explicitly with `.after()`:

from kfp import components, dsl

@components.func_to_container_op
def echo(text: str) -> str:
    print(text)
    return text

@dsl.pipeline(name="with-afters")
def my_pipeline():
    parallel_1 = echo("bish")
    parallel_2 = echo("bash")

    # Runs only after BOTH parallel steps have finished successfully;
    # if either fails, this step is never executed.
    serial_1 = echo("bosh").after(parallel_1).after(parallel_2)

If you want to wait for a whole loop, it's easy too:

@dsl.pipeline(name="with-loop")
def loop_pipeline():
    with dsl.ParallelFor(["bish", "bash"]) as word:
        loop_step = echo(word)

    # Runs only after every iteration of the loop has succeeded.
    serial_1 = echo("bosh").after(loop_step)

Tom Clelford

You could return a Boolean output from each step and then check whether they're all true in the downstream step. I don't know the Kubeflow syntax off the top of my head, but with ZenML (which you can run on Kubeflow) it would look like:

@step
def step1() -> bool:
    if successful:
        return True
    return False

@step
def step2() -> bool:
    if successful:
        return True
    return False

@step
def step3() -> bool:
    if successful:
        return True
    return False

@step
def downstream(step1: bool, step2: bool, step3: bool) -> bool:
    if step1 and step2 and step3:
        # execute stuff
        return True
    return False

@pipeline
def p(step1, step2, step3, downstream):
    downstream(
        step1(),
        step2(),
        step3(),
    )

p(step1(), step2(), step3(), downstream()).run()
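As a plain-Python sanity check of that gating logic (no ZenML needed), the downstream step does its work only when every upstream flag is True; a single False keeps the gate closed:

```python
def downstream(*flags: bool) -> bool:
    """Gate: run the final work only if every upstream step succeeded."""
    if all(flags):
        # execute stuff
        return True
    return False

print(downstream(True, True, True))   # → True  (all succeeded)
print(downstream(True, False, True))  # → False (one failure blocks the gate)
```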
Hamza Tahir
  • In Kubeflow Pipelines there's no need to add the success flag. If a step errors, it will stop all downstream tasks that depend on it from executing. Additionally, you'll lose the ability to add retry logic, because you're not telling kfp the truth about whether the step succeeded. See https://github.com/kubeflow/pipelines/blob/master/samples/core/retry/retry.py – Tom Clelford Sep 13 '22 at 14:43
  • Ah, I misread the OP's question: you can also do this in ZenML, but your answer was Kubeflow-specific as requested, so good one! – Hamza Tahir Sep 15 '22 at 20:43