Kubeflow Training Operator provides Kubernetes custom resources that make it easy to run distributed or non-distributed TensorFlow/PyTorch/Apache MXNet/XGBoost/MPI jobs on Kubernetes.
Questions tagged [kubeflow]
433 questions
2
votes
1 answer
microk8s not running after installation
I want to install Kubeflow using microk8s on a Kubernetes cluster, but I ran into a problem with microk8s. I already installed microk8s using this link. When I checked the microk8s status, it reported that it was not running:
microk8s is not running. Use…

MADFROST
- 1,043
- 2
- 11
- 29
2
votes
0 answers
Kubeflow dashboard returns 403 Forbidden
I have a problem with the Kubeflow Dashboard. Until now I could connect to the dashboard without problems, but after restarting the PC I get a Forbidden error when I try to connect from my browser to http://10.64.140.43.nip.io (this is the URL received…

Ioana Ciangau
- 21
- 1
2
votes
0 answers
How to use Kubeflow volume mount with outputPath parameter?
I am building a Kubeflow pipeline that has 2 components. Component 1 preprocesses some data and component 2 performs model training on that data. I understand I need to save the data at some outputPath parameter generated by Kubeflow. This works. I…

Zach
- 113
- 1
- 9
2
votes
0 answers
Kubeflow: Notebook server stuck on loading
Whenever I try to create a Kubeflow notebook server to build a pipeline from a jupyter notebook, it keeps loading forever without displaying any error.
I'm currently using a Kubeflow dashboard that's already up and running on a server, so I didn't…

camelia
- 41
- 4
2
votes
3 answers
Kubeflow - error in create_run_from_pipeline_func
I'm new to Kubeflow and k8s. I have set up a single-node k8s cluster and installed Kubeflow on it. I'm now trying the simple 'conditional pipeline' example from the "Kubeflow for Machine Learning" book, but I am getting "cannot post…

soumeng78
- 600
- 7
- 12
2
votes
1 answer
istio-ingressgateway always Waiting for Istio Pilot information
I'm trying to deploy Kubeflow on an OVH managed k8s cluster.
After the initial setup of the k8s cluster, I ran the following commands to install Kubeflow, as suggested here:
# install
snap install juju --classic
# get cluster name (should be…

Preston
- 7,399
- 8
- 54
- 84
2
votes
2 answers
What is the best option for building Kubeflow components?
I am reading about Kubeflow, and there are two ways to create components:
Container-Based
Function-Based
But there is no explanation of why I should use one or the other. For example, to load a container-based component, I need to generate a docker…

Tlaloc-ES
- 4,825
- 7
- 38
- 84
2
votes
1 answer
How to log metrics using Kubeflow on Google AI Platform Notebooks
I am building ML models using Google Cloud Platform's AI Platform Notebooks.
I know that AI Platform jobs log hyperparameters, metrics, etc. with nice visualizations, but is there a way to create the same or a similar structure so that I can log…

Jack Smith
- 71
- 6
2
votes
0 answers
Is it possible to use artifacts as source for visualisations in Kubeflow pipelines
I'm experimenting with Kubeflow on minikube and I'm trying to use the visualizations feature of the Kubeflow Pipelines UI.
The documentation states that you should generate an mlpipeline-ui-metadata.json file and add it to the ContainerOp outputs.
This…

alberthier
- 643
- 1
- 7
- 9
2
votes
0 answers
Logout from Kubeflow application with Auth0 causing infinite loop
I am trying to set up authentication to Kubeflow with Auth0, following this manual: Authentication using OIDC (with the difference that I set up a Google account instead of GitHub as the IdP).
Now I am able to log in with my Google account to Kubeflow via Auth0…

Vadim Yangunaev
- 1,817
- 1
- 18
- 41
2
votes
1 answer
Instantiate and Shutdown Kubeflow pods
I'm learning about Kubernetes and Kubeflow, and there's something I want to do, but I can't find a clear answer online on whether it's possible or which route I should take.
When training my machine learning model, I want to use a large…

João Areias
- 1,192
- 11
- 41
2
votes
0 answers
Add GPU to Kubeflow cluster on GKE
I am struggling to add a GPU to my GKE Kubeflow cluster. The documentation still references kfctl and some old set-up parameters. (To be precise, I added a T4 GPU to the GKE cluster successfully, but my notebook server fails to start).
Has anyone…

OlgaPp
- 180
- 11
2
votes
1 answer
How to resolve the "ERROR No Major.Minor.Patch elements found" during ksonnet init step in AWS EKS setup
I'm following the official AWS EKS tutorial on setting up a distributed GPU cluster for TensorFlow model training and am hitting a bit of a snag.
After creating a new cluster using eksctl and verifying that the corresponding ~/.kube/config file…

StuartBernis
- 21
- 2
2
votes
2 answers
ParallelFor in Kubeflow Pipelines
I'd like to use a custom list to run parallel Ops in a Kubeflow Pipeline, and I want to use the value of each list element in the definition of the Op. I'm trying something like this:
my_list = ['foo', 'bar']
with dsl.ParallelFor(my_list) as…

Matteo Felici
- 1,037
- 10
- 19
2
votes
1 answer
Orchestrating TFX Pipelines with Kubeflow locally
I am working on a package that generates TFX pipelines for training GPT-2 (see https://github.com/steven-mi/tfx-gpt2).
I was wondering how I can deploy my pipeline to Kubeflow locally. Is there an in-depth guide for doing so?

stmi
- 21
- 2