
The use case: we are considering triggering an Argo Workflow via Argo Events with Pub/Sub. Pub/Sub does not guarantee that a message is delivered only once. Is there an easy way to prevent a Workflow from being triggered again while it is already running?

Something like the concurrencyPolicy setting for CronWorkflows.

To have something concrete to look at, let's assume the whalesay Workflow:

apiVersion: argoproj.io/v1alpha1
kind: Workflow                  # new type of k8s spec
metadata:
  name: hello-world    # name of the workflow spec
  namespace: argo
spec:
  entrypoint: whalesay          # invoke the whalesay template
  templates:
  - name: whalesay              # name of the template
    container:
      image: docker/whalesay
      command: [cowsay]
      args: ["hello world"]
      resources:                # limit the resources
        limits:
          memory: 32Mi
          cpu: 100m

I found two promising issues, but I fail to extract the solution to this problem from them.


1 Answer


If you just need to make sure the Workflow doesn't run more than one simultaneous instance, use Argo's built-in synchronization feature.

apiVersion: v1
kind: ConfigMap
metadata:
  name: my-config
data:
  workflow: "1"  # only one Workflow may hold this semaphore at a time in this namespace

---

apiVersion: argoproj.io/v1alpha1
kind: Workflow 
metadata:
  name: hello-world
spec:
  entrypoint: whalesay
  synchronization:
    semaphore:
      configMapKeyRef:
        name: my-config
        key: workflow
  templates:
  - name: whalesay
    container:
      image: docker/whalesay
# ...
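If a limit of exactly one is all you need, Argo's synchronization feature (Argo Workflows v3.0 and later) also supports a mutex, which skips the ConfigMap entirely. A sketch; the mutex name `hello-world-lock` is arbitrary:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  name: hello-world
spec:
  entrypoint: whalesay
  synchronization:
    mutex:
      name: hello-world-lock  # only one Workflow holding this mutex runs at a time
  templates:
  - name: whalesay
    container:
      image: docker/whalesay
# ...
```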

If you want to avoid processing the same message twice, you could add a step to the workflow to exit early if the message ID is in a database (or something along those lines).
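To sketch that deduplication idea (everything here is illustrative, not from the answer: the `check-seen` template, the `message-id` parameter, and the Redis host named `redis` are all assumptions), a first step atomically records the message ID with SET NX, and the main work runs only when the ID was new:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-world-
spec:
  entrypoint: main
  arguments:
    parameters:
    - name: message-id        # pass the Pub/Sub message ID in here
  templates:
  - name: main
    steps:
    - - name: check-seen
        template: check-seen
    - - name: whalesay
        template: whalesay
        # run the real work only if the ID had not been recorded before
        when: "{{steps.check-seen.outputs.result}} == new"
  - name: check-seen
    script:
      image: redis:6
      command: [sh]
      source: |
        # SET ... NX returns OK only if the key did not exist yet;
        # EX 86400 lets old IDs expire after a day
        if redis-cli -h redis SET "msg:{{workflow.parameters.message-id}}" 1 NX EX 86400 | grep -q OK; then
          echo new
        else
          echo seen
        fi
  - name: whalesay
    container:
      image: docker/whalesay
      command: [cowsay]
      args: ["hello world"]
```

The script template's stdout becomes `outputs.result`, which the `when` expression checks; any atomic check-and-set store would work in place of Redis.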

crenshaw-dev
  • I think that's also the solution suggested in my linked github discussions. It feels a bit "magical" - maybe it's because I'm not sufficiently acquainted yet with K8s and Argo. – Raffael Feb 23 '21 at 18:30
  • lol fair, I hate magic too. Basically, a Workflow (as far as Kubernetes is concerned) is pretty much just a pile of yaml. The Argo workflow-controller Pod is responsible for seeing when that yaml shows up and doing something with it (spinning up Pods, etc.). The workflow-controller Pod keeps track of how many Workflows have claimed a semaphore. If a Workflow shows up that exceeds the claim limit, the workflow-controller just puts the Workflow on ice until a claim frees up. – crenshaw-dev Feb 23 '21 at 18:34