Hey, I am working on a package that generates TFX pipelines for training GPT-2 (see https://github.com/steven-mi/tfx-gpt2).

I was wondering how I can deploy my pipeline to Kubeflow locally. Is there an in-depth guide for doing so?

stmi

1 Answer

I was working on this a couple of months ago but got pulled off onto other things. I used the recipe below (not quite a script) to get KFP, TFX, and JupyterLab running on a Google Cloud VM, with microk8s as the Kubernetes cluster, and IIRC I was able to deploy the TFX pipeline and run it. It's a work in progress, but for what it's worth, here it is; maybe it will help:

sudo apt-get remove docker docker-engine docker.io containerd runc
sudo apt-get update
sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg-agent \
    software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable"
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io
sudo groupadd docker
sudo usermod -aG docker ${USER}

# K8s 1.14 is currently recommended for KFP
sudo snap install microk8s --channel=1.14 --classic
sudo snap alias microk8s.kubectl kubectl
sudo usermod -a -G microk8s $USER

(exit and log back in)
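
The re-login matters because group membership added with usermod only takes effect in new sessions. A small check along these lines can confirm it worked; this is a sketch, the helper names are illustrative, and the two group names are the ones added in the recipe above:

```python
import grp
import os


def missing_groups(required, current):
    """Return the required group names not present in current, keeping order."""
    current = set(current)
    return [name for name in required if name not in current]


def current_group_names():
    """Names of the groups the running process belongs to (Unix only)."""
    names = []
    for gid in os.getgroups():
        try:
            names.append(grp.getgrgid(gid).gr_name)
        except KeyError:
            # A gid with no entry in the group database; skip it.
            pass
    return names


if __name__ == "__main__":
    missing = missing_groups(["docker", "microk8s"], current_group_names())
    if missing:
        print("Not yet in:", ", ".join(missing), "(log out and back in)")
    else:
        print("Group membership looks fine")
```

If either group is reported as missing after logging back in, the usermod commands probably need to be re-run.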

docker run -d -p 5000:5000 --restart=always --name registry registry:2

microk8s.enable dns dashboard storage
microk8s.enable kubeflow
export PIPELINE_VERSION=0.2.5
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/base/crds?ref=$PIPELINE_VERSION"
kubectl wait --for condition=established --timeout=60s crd/applications.app.k8s.io
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/env/dev?ref=$PIPELINE_VERSION"
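
If you'd rather drive the KFP install from Python (for example from a setup script), the three kubectl commands can be built from a single version string. This is only a sketch: the function name is made up for illustration, and the URLs and the 0.2.5 default are copied from the recipe above.

```python
import shlex

KFP_REPO = "github.com/kubeflow/pipelines/manifests/kustomize"


def kfp_install_commands(version="0.2.5"):
    """Build the three kubectl commands from the recipe for one KFP version."""
    crds = f"{KFP_REPO}/base/crds?ref={version}"
    env = f"{KFP_REPO}/env/dev?ref={version}"
    return [
        ["kubectl", "apply", "-k", crds],
        ["kubectl", "wait", "--for", "condition=established",
         "--timeout=60s", "crd/applications.app.k8s.io"],
        ["kubectl", "apply", "-k", env],
    ]


if __name__ == "__main__":
    for cmd in kfp_install_commands():
        # shlex.join quotes the "?ref=" URLs so a shell will not glob them.
        print(shlex.join(cmd))
```

Each list can also be passed directly to subprocess.run(cmd, check=True), which skips the shell entirely and stops at the first failing step.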

sudo apt-get install python3-pip
sudo update-alternatives --install /usr/bin/python python /usr/bin/python3.6 1
sudo update-alternatives --set python /usr/bin/python3.6
sudo update-alternatives --install /usr/bin/pip pip /usr/bin/pip3 1
sudo update-alternatives --set pip /usr/bin/pip3
pip install --upgrade pip

export PATH=$PATH:~/.local/bin
pip install notebook
pip install jupyterlab

<Make public IP address static>

jupyter notebook --generate-config
Set a password (optional):
python
from notebook.auth import passwd; passwd()
(remember the password you typed, and save the hash string that passwd() prints)

vi ~/.jupyter/jupyter_notebook_config.py
Enable:
    c.NotebookApp.ip = '*'
    c.NotebookApp.open_browser = False
    c.NotebookApp.port = 3389 # for Pantheon (normally 8888)
    c.NotebookApp.password = '<hash string generated above>'
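
As an alternative to editing the file in vi, the same four settings can be appended from Python. A sketch only: the function names are hypothetical, the values are the ones listed above, and the password hash is whatever passwd() printed for you.

```python
from pathlib import Path


def notebook_settings(password_hash, port=3389):
    """Return the config lines from the recipe as one text block."""
    return "\n".join([
        "c.NotebookApp.ip = '*'",
        "c.NotebookApp.open_browser = False",
        f"c.NotebookApp.port = {port}",
        f"c.NotebookApp.password = {password_hash!r}",
    ]) + "\n"


def append_settings(config_path, password_hash, port=3389):
    """Append the settings to the config file and return its path."""
    path = Path(config_path).expanduser()
    with path.open("a", encoding="utf-8") as fh:
        # The config file is plain Python, so assignments appended at the
        # end override any earlier ones for the same option.
        fh.write("\n" + notebook_settings(password_hash, port))
    return path
```

For example, append_settings("~/.jupyter/jupyter_notebook_config.py", "<your hash>") writes the block to the file created by jupyter notebook --generate-config.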

pip install --no-cache-dir --upgrade tfx
git clone https://github.com/tensorflow/tfx.git
mkdir AIHub
cp tfx/docs/tutorials/tfx/template.ipynb AIHub
cd AIHub

(wait about 5-15 minutes)
kubectl describe configmap inverse-proxy-config -n kubeflow | grep googleusercontent.com
jupyter lab &
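
For the deployment step itself, which the recipe stops short of, here is a minimal sketch. It assumes the KFP v1 runner that ships with TFX (tfx.orchestration.kubeflow.kubeflow_dag_runner) and that you already have a TFX pipeline object. The helper names, the image name tfx-gpt2, and the tag are hypothetical; the registry address is the local registry started earlier with docker run ... registry:2.

```python
def local_image(name, tag="latest", registry="localhost:5000"):
    """Build an image reference that points at the local Docker registry."""
    return f"{registry}/{name}:{tag}"


def compile_for_kubeflow(tfx_pipeline, tfx_image, output_dir="."):
    """Compile a TFX pipeline into a package that Kubeflow Pipelines can run."""
    # Imported lazily so this module still loads where tfx is not installed.
    from tfx.orchestration.kubeflow import kubeflow_dag_runner

    config = kubeflow_dag_runner.KubeflowDagRunnerConfig(
        # Use the metadata store that the KFP deployment provides.
        kubeflow_metadata_config=(
            kubeflow_dag_runner.get_default_kubeflow_metadata_config()
        ),
        # Container image in which every pipeline component will run.
        tfx_image=tfx_image,
    )
    runner = kubeflow_dag_runner.KubeflowDagRunner(
        config=config, output_dir=output_dir
    )
    # Writes the pipeline package into output_dir; it does not start a run.
    runner.run(tfx_pipeline)


if __name__ == "__main__":
    # "tfx-gpt2" is a placeholder image name; build and push your own first.
    print(local_image("tfx-gpt2"))
```

The package this writes can then be uploaded through the Kubeflow Pipelines UI and started from there. The image has to contain your pipeline code and be pushed to the registry before the run.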
RCrowe