I am looking into Cloud Run Jobs, but I'm having trouble with the documentation and (rare) examples for the product.

I want to create a Cloud Run Job that is triggered on a schedule by Cloud Scheduler. The number of tasks depends on the number of rows in a table, because I want to run one task per row.

Given that each task runs independently and there's no "parent" container holding the task information, the only solution that comes to mind is a LIMIT 1 OFFSET ${BATCH_TASK_INDEX} query to fetch one row per task, which doesn't seem very efficient. Also, with this method I wouldn't know the task count in advance.
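To make the idea concrete, here is a minimal sketch of the per-task query, not a tested production setup. It assumes a hypothetical `items` table, uses an in-memory SQLite database as a stand-in for the real one, and reads the task index from `CLOUD_RUN_TASK_INDEX`, which is the variable Cloud Run Jobs documents for this purpose (the `BATCH_TASK_INDEX` name above comes from Batch). The function and table names are made up for illustration.

```python
import os
import sqlite3


def fetch_row_for_task(conn, task_index):
    """Return the single row assigned to this task, or None if out of range."""
    # A stable ORDER BY matters here: without it, OFFSET could hand
    # the same row to two tasks or skip a row entirely.
    cur = conn.execute(
        "SELECT id, payload FROM items ORDER BY id LIMIT 1 OFFSET ?",
        (task_index,),
    )
    return cur.fetchone()


def demo_connection():
    """Build an in-memory stand-in for the real table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany(
        "INSERT INTO items (payload) VALUES (?)", [("a",), ("b",), ("c",)]
    )
    return conn


if __name__ == "__main__":
    # Falls back to 0 so the sketch also runs outside Cloud Run.
    task_index = int(os.environ.get("CLOUD_RUN_TASK_INDEX", "0"))
    print(fetch_row_for_task(demo_connection(), task_index))
```

A task whose index is beyond the last row gets `None` back, which shows the second problem: the task count has to be chosen before the row count is known.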

I've seen there's a newer product "Batch" which has a script for the job and a container for the task, which kind of works for my use case. But I fail to understand what Cloud Run Jobs is actually good for. Maybe someone can shed some light?

Patrick
  • Why not create a "parent" task which will spawn the actual tasks? As for use cases, this page mentions some reasonable scenarios: https://codelabs.developers.google.com/codelabs/cloud-starting-cloudrun-jobs#0 – yedpodtrzitko Jun 13 '23 at 05:32
  • That's an interesting idea, thank you! It's very close to Batch then, so maybe easier to just go with that one. – Patrick Jun 13 '23 at 15:41
  • Glad to hear that you got an idea. Maybe you can post it as an answer so that other members facing a similar issue are helped out. – Sandeep Vokkareni Jun 14 '23 at 06:18

1 Answer

Thanks @yedpodtrzitko for the idea. I spent several hours on different approaches and I still don't know how Cloud Run Jobs is supposed to work.

"parent" container idea

The issue here is that executing a Cloud Run Job doesn't take any arguments, so I can't just have a pre-defined job and pass in the arguments before running it. The entire job needs to be created with fixed arguments. This means that even if I have a parent container to fetch the rows, I'd still need to create the job first and delete it afterwards.

Batch

Batch seems to be a little more flexible and would probably work for the use case, but I found the documentation too bare to justify starting another investigation into a new product.

Cloud Functions & Tasks

In the end I decided to use a "parent" Cloud Function to retrieve the rows and then create Cloud Tasks that call a second Function to process the individual items. Cloud Tasks adds some control over concurrency, which helps avoid rate-limit errors and similar problems.
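The fan-out step can be sketched roughly as follows. This is an illustration under assumptions, not my exact code: `build_task` produces a task body in the shape of the Cloud Tasks REST API (an `httpRequest` with a base64-encoded `body`), the worker URL is a placeholder, and `enqueue` is a hypothetical callable standing in for whatever actually submits the task to the queue.

```python
import base64
import json


def build_task(row, worker_url):
    """Build a REST-style Cloud Tasks body that POSTs one row to the worker."""
    body = json.dumps(row).encode("utf-8")
    return {
        "httpRequest": {
            "httpMethod": "POST",
            "url": worker_url,  # placeholder URL of the second Function
            "headers": {"Content-Type": "application/json"},
            # The REST representation carries the body as base64 text.
            "body": base64.b64encode(body).decode("ascii"),
        }
    }


def fan_out(rows, worker_url, enqueue):
    """Create one task per row, pass each to `enqueue`, and return the count."""
    count = 0
    for row in rows:
        enqueue(build_task(row, worker_url))
        count += 1
    return count


if __name__ == "__main__":
    queued = []  # a plain list stands in for the real queue
    total = fan_out(
        [{"id": 1}, {"id": 2}],
        "https://example.com/process-item",
        queued.append,
    )
    print(total, "tasks prepared")
```

Keeping `enqueue` as a parameter means the parent function can be exercised locally with a list, while the real deployment would pass a function that submits to the queue, where the concurrency limits are configured.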

I'm intrigued by Cloud Run Jobs, so I'll continue to look for problems that can be solved with that, but so far no luck.

Patrick