9

I have a batch job that takes a couple of hours to run. How can I run this in a serverless way on Google Cloud?

App Engine, Cloud Functions, and Cloud Run are limited to 10-15 minutes. I don't want to rewrite my code in Apache Beam.

Is there an equivalent to AWS Batch on Google Cloud?

Lak

9 Answers

11

Note: Cloud Run and Cloud Functions can now run for up to 60 minutes. The answer below remains a viable approach if you have a multi-hour job.

Vertex AI Training is serverless and long-lived. Wrap your batch-processing code in a Docker container, push it to gcr.io, and then run:

gcloud ai custom-jobs create \
  --region=LOCATION \
  --display-name=JOB_NAME \
  --worker-pool-spec=machine-type=MACHINE_TYPE,replica-count=REPLICA_COUNT,container-image-uri=CUSTOM_CONTAINER_IMAGE_URI

You can run any arbitrary Docker container — it doesn’t have to be a machine learning job. For details, see:

https://cloud.google.com/vertex-ai/docs/training/create-custom-job#create_custom_job-gcloud
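For reference, building and pushing the container might look like this (the image name here is illustrative, not prescribed by the docs):

# hypothetical image name; replace PROJECT_ID with your own project
docker build -t gcr.io/PROJECT_ID/my-batch-job:latest .
docker push gcr.io/PROJECT_ID/my-batch-job:latest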

Today you can also use Cloud Batch: https://cloud.google.com/batch/docs/get-started#create-basic-job

Lak
5

Google Cloud does not offer a product comparable to AWS Batch (see https://cloud.google.com/docs/compare/aws/#service_comparisons).

Instead, you'll need to use Cloud Tasks or Pub/Sub to delegate the work to another product such as Compute Engine, though that approach isn't truly "serverless".
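As a rough sketch of the delegation side (the topic name and payload are illustrative), you would publish a message that a Compute Engine worker subscribes to:

# example topic name and message; a worker on Compute Engine pulls and processes it
gcloud pubsub topics create batch-jobs
gcloud pubsub topics publish batch-jobs --message='{"job":"nightly-report"}'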

Dustin Ingram
4

Google has finally released (in Beta for the moment) Cloud Batch, which does exactly what you want: you submit jobs (containers or scripts) and it runs them. Simple as that. https://cloud.google.com/batch/docs/get-started
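A minimal sketch following the linked quickstart (the job name, region, and script are illustrative): define a job.json like

{
  "taskGroups": [{
    "taskSpec": {
      "runnables": [{
        "script": {"text": "echo Hello from Batch"}
      }]
    }
  }]
}

and submit it with:

# job name and region are placeholders
gcloud batch jobs submit my-batch-job \
  --location=us-central1 \
  --config=job.json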

Adrien QUINT
2

I faced the same problem. In my case I went with:

  1. Cloud Scheduler starts the job by pushing a message to Pub/Sub.
  2. Pub/Sub triggers a Cloud Function.
  3. The Cloud Function spins up a Compute Engine instance.
  4. The Compute Engine instance runs the batch workload and kills itself once it's done.

You can read my post on Medium: https://link.medium.com/1K3NsElGYZ

It might help you get started. There's also a follow-up post showing how to use a Docker container inside the Compute Engine instance: https://medium.com/google-cloud/serverless-batch-workload-on-gcp-adding-docker-and-container-registry-to-the-mix-558f925e1de1
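As a sketch of step 1 (the schedule, topic, and job name are illustrative), the Cloud Scheduler job that pushes to Pub/Sub can be created like this:

# runs daily at 02:00; topic and message body are examples
gcloud scheduler jobs create pubsub start-batch \
  --schedule="0 2 * * *" \
  --topic=batch-trigger \
  --message-body="start"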

mesmacosta
2

This answer to "How to make GCE instance stop when its deployed container finishes?" will work for you as well:

In short:

  • First, dockerize your batch process.
  • Then, create an instance:
    • using a container-optimized image,
    • and using a startup script that pulls your Docker image, runs it, and shuts down the machine at the end (a minimal sketch follows).
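A minimal startup-script sketch under those assumptions (the image name is illustrative; startup scripts run as root, so no sudo is needed):

#! /bin/bash
# Pull and run the batch container, then power off the instance.
docker pull gcr.io/PROJECT_ID/my-batch-job:latest
docker run --rm gcr.io/PROJECT_ID/my-batch-job:latest
shutdown -h now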
Iñigo González
1

You can use Cloud Run. At the time of writing, the Cloud Run (fully managed) timeout has been increased to 60 minutes, though this is still in beta.

https://cloud.google.com/run/docs/configuring/request-timeout

Important: Although Cloud Run (fully managed) has a maximum timeout of 60 minutes, only timeouts of 15 minutes or less are generally available: setting timeouts greater than 15 minutes is a Beta feature.
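For example, the timeout can be raised on an existing service like this (the service name and region are placeholders):

# 3600 seconds = 60 minutes, the current maximum
gcloud run services update SERVICE_NAME --timeout=3600 --region=REGION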

Suds
0

Another alternative for batch computing is using Google Cloud Life Sciences.

An example application using Cloud Life Sciences is dsub.

Or see the Cloud Life Sciences Quickstart documentation.
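A minimal dsub invocation might look like this (the project, bucket, and command are illustrative):

# google-cls-v2 is the Cloud Life Sciences provider
dsub \
  --provider google-cls-v2 \
  --project PROJECT_ID \
  --regions us-central1 \
  --logging gs://YOUR_BUCKET/logs \
  --image ubuntu:20.04 \
  --command 'echo "hello from dsub"' \
  --wait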

indraniel
0

I found myself looking for a solution to this problem and built something similar to what mesmacosta has described in a different answer, in the form of a reusable tool called gcp-runbatch.

If you can package your workload into a Docker image then you can run it using gcp-runbatch. When triggered, it will do the following:

  1. Create a new VM
  2. On VM startup, docker run the specified image
  3. When the docker run exits, the VM will be deleted

Some features that are supported:

  • Invoke batch workload from the command line, or deploy as a Cloud Function and invoke that way (e.g. to trigger batch workloads via Cloud Scheduler)
  • stdout and stderr will be piped to Cloud Logging
  • Environment variables can be specified by the invoker, or pulled from Secret Manager

Here's an example command line invocation:

$ gcp-runbatch \
  --project-id=long-octane-350517 \
  --zone=us-central1-a \
  --service-account=1234567890-compute@developer.gserviceaccount.com \
  hello-world
Successfully started instance runbatch-38408320. To tail batch logs run:
CLOUDSDK_PYTHON_SITEPACKAGES=1 gcloud beta --project=long-octane-350517
logging tail 'logName="projects/long-octane-350517/logs/runbatch" AND
resource.labels.instance_id="runbatch-38408320"' --format='get(text_payload)'
0

GCP launched its new "Batch" service in July '22. It's basically Compute Engine packaged with some utilities to easily productionize a batch job, including defining required resources, executables (script- or container-based), and a run schedule.

Haven't used it yet, but seems like a great fit for batch jobs that take over 1 hr.

Nathan Gould