I have an existing TFX pipeline here that I want to rewrite using the KubeFlow Pipelines SDK.

The existing pipeline uses many TFX Standard Components, such as ExampleValidator. In the KubeFlow SDK, I see a kfp.components package but no prebuilt components like the ones TFX provides.

Does the KubeFlow SDK have an equivalent to the TFX Standard Components?

pirateofebay

2 Answers

You don’t have to rewrite the components. There is no mapping from TFX components to KFP components, because the two are not competing tools.

With TFX, you create the components and then use an orchestrator to run them. Kubeflow Pipelines is one of those orchestrators.

The tfx.orchestration.pipeline module wraps your TFX components and creates your pipeline.
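As a rough sketch of what that can look like, the snippet below wraps a few standard components in a pipeline and hands it to the Kubeflow runner. The data path, pipeline root, and pipeline name are placeholders, and the import paths assume a TFX release that still ships the `tfx.orchestration.kubeflow` runner, so check them against your installed version.

```python
def build_pipeline(data_root: str, pipeline_root: str):
    """Wrap standard TFX components in a tfx.orchestration.pipeline.Pipeline."""
    # Imported inside the function so this sketch loads even without TFX installed.
    from tfx.components import (
        CsvExampleGen,
        ExampleValidator,
        SchemaGen,
        StatisticsGen,
    )
    from tfx.orchestration import pipeline

    # Each component consumes the output channels of the previous one.
    example_gen = CsvExampleGen(input_base=data_root)
    statistics_gen = StatisticsGen(examples=example_gen.outputs["examples"])
    schema_gen = SchemaGen(statistics=statistics_gen.outputs["statistics"])
    example_validator = ExampleValidator(
        statistics=statistics_gen.outputs["statistics"],
        schema=schema_gen.outputs["schema"],
    )

    return pipeline.Pipeline(
        pipeline_name="my_tfx_pipeline",  # placeholder name
        pipeline_root=pipeline_root,
        components=[example_gen, statistics_gen, schema_gen, example_validator],
    )


def compile_for_kubeflow(data_root: str, pipeline_root: str) -> None:
    """Compile the same TFX pipeline into a Kubeflow Pipelines package."""
    from tfx.orchestration.kubeflow import kubeflow_dag_runner

    # With default settings the runner writes a pipeline package to disk;
    # that file is what gets uploaded to Kubeflow Pipelines.
    kubeflow_dag_runner.KubeflowDagRunner().run(
        build_pipeline(data_root, pipeline_root)
    )
```

The component definitions stay pure TFX; only the last call is specific to Kubeflow, which is why nothing has to be rewritten.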

We have two schedulers behind Kubeflow Pipelines: Argo (used by GCP) and Tekton (used by OpenShift). The respective repositories contain examples of TFX with Kubeflow Pipelines on Tekton and TFX with Kubeflow Pipelines on Argo.

Theofilos Papapanagiotou

Actually, Kubeflow does have a notion of reusable components, which is referenced in the docs. They can be Python-based, YAML-based, and so on. However, there are no 'standard' ones like TFX has. You can see a bunch of them in the examples repo, and you can create your own reusable ones.
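For illustration, here is a minimal sketch of a Python-based reusable component. The function `check_min_rows` and the base image are made up for this example, and the wrapper assumes the KFP SDK v2 `dsl.component` API (in SDK v1 the rough equivalent is `kfp.components.create_component_from_func`).

```python
def check_min_rows(row_count: int, min_rows: int) -> bool:
    """Hypothetical validation step: True when the dataset has enough rows."""
    return row_count >= min_rows


def as_kfp_component():
    """Wrap the plain function as a reusable KFP component (requires kfp v2)."""
    # Imported lazily so check_min_rows stays usable without kfp installed.
    from kfp import dsl

    # base_image is an assumption; use whatever image your cluster can pull.
    return dsl.component(check_min_rows, base_image="python:3.10")
```

The plain function can be unit-tested on its own, for example `check_min_rows(10, 5)` returns `True`, and the wrapped component can then be used as a step inside a KFP pipeline.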

You can treat TFX components and Kubeflow components somewhat interchangeably though, as TFX components get compiled into the Kubeflow representation by the orchestrator logic. Simply use the KubeflowDagRunner with your TFX pipelines. However, I might be missing something: what is your motivation to rewrite in Kubeflow?

Hamza Tahir