I am working on a small proof-of-concept project for my company and would like to use Argo Workflows to automate some data engineering tasks. It's really easy to get set up, and I've been able to create a number of workflows that process data stored in a Docker image or retrieved from a REST API. However, to work with our sensitive data I would like to mount a hostPath-backed persistent volume into one of my workflow tasks. When I follow the documentation I don't get the desired behavior: the directory appears empty.

OS: Ubuntu 18.04.4 LTS
Kubernetes distribution: Minikube v1.20.0
Kubernetes version: v1.20.2
Argo Workflows version: v3.1.0-rc4
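
For reference, here is how I inspect the hostPath directory as the node sees it (these commands assume a default Minikube setup; with most drivers the node runs in its own VM or container, so its /tmp/data is not necessarily the workstation's /tmp/data):

```shell
# List the directory from inside the Minikube node
minikube ssh -- ls -la /tmp/data

# While debugging, a host directory can be mirrored into the node
# (this runs in the foreground until interrupted):
minikube mount /tmp/data:/tmp/data
```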

My persistent volume (claim) looks like this:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: argo-volume
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/tmp/data"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: argo-hello
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

and I apply both with kubectl -n argo apply -f pv.yaml.
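
For completeness, this is how I verify that the claim actually binds (PersistentVolumes themselves are cluster-scoped, so the namespace flag only matters for the claim):

```shell
# Check that the PV exists and the PVC is bound to it
kubectl get pv argo-volume
kubectl -n argo get pvc argo-hello

# Events here would show scheduling or binding problems
kubectl -n argo describe pvc argo-hello
```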

My workflow looks as follows:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-world-volumes-
spec:
  entrypoint: dag-template
  arguments:
    parameters:
    - name: file
      value: /mnt/vol/test.txt 
  volumes:
  - name: datadir
    persistentVolumeClaim:
      claimName: argo-hello

  templates:
  - name: dag-template
    inputs:
      parameters:
      - name: file
    dag:
      tasks:
      - name: readFile
        arguments: 
          parameters: [{name: path, value: "{{inputs.parameters.file}}"}]
        template: read-file-template
      - name: print-message
        template: helloargo
        arguments: 
          parameters: [{name: msg, value: "{{tasks.readFile.outputs.result}}"}] 
        dependencies: [readFile]   
  
  - name: helloargo
    inputs:
      parameters:
      - name: msg
    container:
      image: lambertsbennett/helloargo
      args: ["-msg", "{{inputs.parameters.msg}}"]
  
  - name: read-file-template
    inputs:
      parameters:
      - name: path
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["find /mnt/vol; ls -a /mnt/vol"]
      volumeMounts:
      - name: datadir
        mountPath: /mnt/vol

When this workflow executes, it just prints an empty directory listing, even though I populated the host directory with files. Is there something I am fundamentally missing? Thanks for any help.