
I have a Python program that launches a batch job. The job outputs a JSON file, and I'd like to know the easiest way to get this result back to the Python program that launched it.

So far I thought of these solutions:

  • Upload the JSON file to S3 (pretty heavy)

  • Print it to the pod logs, then read the logs from the Python program (pretty hacky/dirty)

  • Mount a PVC, launch a second pod with the same PVC, and use it as a shared disk between that pod and the job (pretty overkill)

The JSON file is pretty lightweight. Isn't there a way to do something like adding some metadata to the pod when the job completes? The Python program could then just poll that metadata (see the sketch below for what I have in mind).
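Roughly what I mean, as a hypothetical sketch using the official kubernetes Python client (the job name "result-job" and the annotation key "batch/result" are made up, and annotation values are size-limited, so this would only fit small payloads):

```python
# Hypothetical sketch only: assumes the batch job (or something watching it)
# patches its JSON result into an annotation on the Job object when it finishes.
import json
import time

from kubernetes import client, config

config.load_kube_config()
batch_api = client.BatchV1Api()


def wait_for_result(job_name="result-job", namespace="default", key="batch/result"):
    """Poll the Job's annotations until the result shows up, then parse it."""
    while True:
        job = batch_api.read_namespaced_job(name=job_name, namespace=namespace)
        annotations = job.metadata.annotations or {}
        if key in annotations:
            return json.loads(annotations[key])
        time.sleep(5)


print(wait_for_result())
```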

E-Kami
  • Allow outbound traffic to the batch job and save the JSON to a NoSQL database like DynamoDB or MongoDB? – Tom McLean Nov 04 '22 at 08:07
  • That's also pretty heavy imo, I would like my initial Python program to handle that rather than the batch job – E-Kami Nov 04 '22 at 16:35

1 Answer


An easy way that doesn't involve any other databases or pods is to run the batch job as an init container, mount a volume that is shared by both containers, and read the JSON file from the Python container that runs next. This approach doesn't even need a persistent volume, just a shared one. See this example:

https://kubernetes.io/docs/tasks/access-application-cluster/communicate-containers-same-pod-shared-volume/
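For example, something along these lines with the official kubernetes Python client (a sketch only; the image names, commands, and the /results path are placeholders): the batch job runs as an init container and writes its JSON to a shared emptyDir volume, and the main container reads it once the init container has finished.

```python
# Sketch: batch job as an init container writing to a shared emptyDir volume.
# Image names, commands, and paths are placeholders.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a cluster

shared_volume = client.V1Volume(
    name="results",
    empty_dir=client.V1EmptyDirVolumeSource(),
)
shared_mount = client.V1VolumeMount(name="results", mount_path="/results")

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="batch-with-result"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        volumes=[shared_volume],
        # The batch job runs first and writes its output to the shared volume.
        init_containers=[
            client.V1Container(
                name="batch-job",
                image="my-batch-image:latest",  # placeholder image
                command=["run-batch", "--out", "/results/result.json"],  # placeholder command
                volume_mounts=[shared_mount],
            )
        ],
        # The follow-up Python container consumes the JSON from the same volume.
        containers=[
            client.V1Container(
                name="consumer",
                image="python:3.11-slim",
                command=[
                    "python", "-c",
                    "import json; print(json.load(open('/results/result.json')))",
                ],
                volume_mounts=[shared_mount],
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```

Since the main container only starts after the init container completes successfully, the file is guaranteed to be there when the consumer runs.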

Also, depending on the complexity of these jobs, I would recommend taking a look at Argo Workflows or any other DAG-based job scheduler.

paltaa