
I have been racking my brains on this for a few weeks now, trying different Google Cloud service offerings, but I can't seem to find the right one.

I have a Python script with dependencies etc. that I have containerized, pushed, and deployed to GCR.

The script is a bot that connects to an external websocket, perpetually receiving signals, and then does further processing via API calls against another external service.
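To give a sense of its shape, the core is roughly the loop below (a simplified, illustrative sketch; the URL, library choice, and handler are placeholders, not my actual code).

# Illustrative sketch only; the URL and handler are placeholders.
import asyncio
import websockets  # third-party: pip install websockets


async def handle_signal(message):
    # Placeholder for the processing done via the other service's API
    print(f"received signal: {message}")


async def run_bot():
    # Stays connected and consumes signals indefinitely
    async with websockets.connect("wss://example.com/signals") as ws:
        async for message in ws:
            await handle_signal(message)


if __name__ == "__main__":
    asyncio.run(run_bot())  # under normal operation, this never returns

The important property is that run_bot() blocks forever, which seems to be exactly what the serverless offerings dislike.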

What would be the best service offering from Google Cloud to run this?

So far, I've tried:

GCR Web Service - requires a listening HTTP service (on :8080), which I do not provide in this use case, and it scales your service down to zero when there is no traffic, so no go.

GCR Job Service - seems like the next ideal option (no HTTP port requirement); however, since the script (my entry point) doesn't return after launch unless it quits, the job only lets it run for a minute or so until the Jobs API declares it 'failed'. Basically, it launches my entry point as if the script were running locally, and my script is never meant to return.

To try and get around this, I went Google's recommended way and built a main.py with their standard boilerplate, acting as a wrapper/launcher for the actual script. I did this via a simple subprocess.Popen, using their sample main.py as shown below.

main.py

import json
import os
import sys
import subprocess


# Retrieve Job-defined env vars
TASK_INDEX = os.getenv("CLOUD_RUN_TASK_INDEX", 0)
TASK_ATTEMPT = os.getenv("CLOUD_RUN_TASK_ATTEMPT", 0)


# Define main script
def main():
    print(f"Starting Task #{TASK_INDEX}, Attempt #{TASK_ATTEMPT}...")

    subprocess.Popen(["python3", "myscript.py"], stdout=subprocess.PIPE, stderr=subprocess.PIPE)

    print(f"Completed Task #{TASK_INDEX}.")



# Start script
if __name__ == "__main__":
    try:
        main()
    except Exception as err:
        message = f"Task #{TASK_INDEX}, " \
                  + f"Attempt #{TASK_ATTEMPT} failed: {str(err)}"

        print(json.dumps({"message": message, "severity": "ERROR"}))
        sys.exit(1)  # Retry Job Task by exiting the process

My thinking was that this would let the job execute my script and be marked as completed while the actual script keeps running. Also, since subprocess.Popen sets its stdout and stderr to PIPE, I assumed the output would get picked up by Google's logging and I would see it.

The job runs and is marked as succeeded; however, I see no indication of the actual script executing anywhere.

I had a similar issue with Google Cloud Functions. Jobs seemed like an ideal option since I can use their scheduler to make sure it launches, say, every hour (my script uses a lock file so it doesn't run again if it's already running).
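For reference, the lock-file guard I mean looks roughly like this (a minimal sketch; the path is hypothetical). Note it only works within a single machine/filesystem, which is part of why I'm wondering whether a full VM is the better fit.

# Minimal sketch of a lock-file guard; LOCK_PATH is hypothetical.
import fcntl
import sys

LOCK_PATH = "/tmp/mybot.lock"


def acquire_lock():
    lock_file = open(LOCK_PATH, "w")
    try:
        # Non-blocking exclusive lock; raises OSError if already held
        fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        print("Bot already running; exiting.")
        sys.exit(0)
    return lock_file  # keep the handle so the lock stays held


lock = acquire_lock()
# ... bot loop runs here ...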

  • Am I just missing the point of how these cloud services are meant to run?

  • Are offerings like Google Cloud Run jobs/functions only meant to execute tasks that return and quit until launched again by whatever schedule?

  • Do I need to consider Google Compute Engine for this use case, i.e., a full running VM instead of stateless/serverless options?

  • I am trying to run this in a containerized, scale-as-needed fashion, both to keep my project portable and to minimize costs as much as possible, given the always-running nature of the job.

Lastly, I know services like PythonAnywhere (and I'm sure others) make this kind of thing easier, but I would like to learn how to do it via standard cloud offerings like GCP, AWS, etc.

thanks for any insight / advice!

  • A Cloud Run job would fit your use case, but I think you should tweak your current approach to make it work. In your initial setup (without subprocess), what was happening is that you hit the default `10min` execution limit; you can raise it to 1h when you create the job. After switching to subprocess, if you return before waiting for the script (the subprocess you created), you won't see anything, because Cloud Run will think the instance is ready to be shut down and everything running inside it gets killed (a sketch of this tweak follows the comments below). – bhito Feb 15 '23 at 11:48
  • I realize that is what is happening, but doesn't that mean Cloud Run isn't a viable option then? – JackieTreehorn Feb 15 '23 at 17:04
  • How long do you need to run your bot? Full time? – guillaume blaquiere Feb 15 '23 at 17:28
  • Yes, I need it to run perpetually; essentially, a job that launches and is then checked every 60 secs via a trigger to ensure it's still running. Since my bot uses a file lock, it would only execute again if found not running… – JackieTreehorn Feb 15 '23 at 18:46
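A minimal sketch of the tweak bhito describes, assuming the same myscript.py entry point: drop the PIPE arguments so the child inherits the wrapper's stdout/stderr, and block until the child exits instead of returning immediately.

# main.py reworked per the comment above: wait for the child and let it
# write straight to this process's stdout/stderr.
import subprocess
import sys

# No stdout/stderr=PIPE: the child inherits our streams, so its output
# shows up in the job's logs; run() blocks until the script exits.
result = subprocess.run(["python3", "myscript.py"])
sys.exit(result.returncode)

The trade-off, as the comments note, is that the task then runs until the bot exits, so it is still capped by the job's task timeout (raisable to 1h at job creation).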

1 Answer


Cloud Run's best fit is serving stateless HTTP REST APIs. There are also Jobs, currently in beta.

One of the top features of Cloud Run is that it scales to zero when there are no requests to your service (your service instance gets completely destroyed).

If your bot needs to stay alive forever, Cloud Run is not for you (even though you can configure Cloud Run to keep at least one instance alive).

I would instead consider App Engine or Compute Engine.

Fabio B.