
On the server side: I need a way to execute tasks in the background, both on a frequent interval and starting at a specific time. My stack is Python on the back end (Sanic framework), Vue.js on the front end, MongoDB as the main database, and Redis for caching, all running in Docker containers (docker-compose). I have worked with Celery before, but I want to know which solution is stable and reliable enough for production.

On the client side: Besides the server-side requirement above, I sometimes need to run a job scheduler on clients: embedded devices such as a Raspberry Pi that can run Python or JavaScript.

So, what solutions would you recommend for these use cases?

Ali Hallaji
  • Could you please clarify your note on the client-side architecture? I'm not sure whether you're just looking for a cron-like solution for your Python backend, or also for some sort of JavaScript scheduling API for your front end –  Mar 30 '20 at 15:44
  • @ChefCyanide I'm looking for a `cron-like` solution. – Ali Hallaji Mar 30 '20 at 15:48

3 Answers


In production we have both long- and short-running tasks, and in total our Celery cluster executes up to 6M tasks per day, so naturally I would recommend Celery. It is made for exactly this purpose, and if you are a Python developer you have another reason to pick it. Finally, Celery is the only Python task-queue system known to me that has an HA scheduler (https://github.com/mixkorshun/celery-beatx and https://github.com/sibson/redbeat).
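For illustration, here is a minimal sketch of a periodic Celery setup; the module name `tasks.py`, the Redis broker URL, and the task itself are placeholders rather than anything specific to the question:

```python
# tasks.py - minimal Celery app with a beat schedule (illustrative sketch)
from celery import Celery
from celery.schedules import crontab

app = Celery("tasks", broker="redis://localhost:6379/0")

@app.task
def cleanup():
    # The actual background work goes here.
    print("running cleanup")

app.conf.beat_schedule = {
    # Repeat every 60 seconds.
    "cleanup-every-minute": {"task": "tasks.cleanup", "schedule": 60.0},
    # Start at a specific time of day (02:30).
    "cleanup-nightly": {"task": "tasks.cleanup", "schedule": crontab(hour=2, minute=30)},
}
```

Run it with a worker (`celery -A tasks worker`) and a beat process (`celery -A tasks beat`), or swap the default beat for one of the HA schedulers linked above.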

There are two other (Python) projects that should be mentioned as alternatives to Celery - Huey (https://github.com/coleifer/huey) and Apache Airflow (https://github.com/apache/airflow).
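As a rough sketch of the Huey alternative (Redis backend assumed; the module and task names are illustrative):

```python
# huey_tasks.py - periodic task with Huey's Redis backend (illustrative sketch)
from huey import RedisHuey, crontab

huey = RedisHuey("my-app", host="localhost")

@huey.periodic_task(crontab(minute="*/1"))
def every_minute():
    # Runs once a minute inside the Huey consumer.
    print("running periodic job")
```

The consumer is started with `huey_consumer.py huey_tasks.huey`.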

DejanLekic
  • Keep in mind that if you execute tens of thousands of tasks (and more) per day, forget Airflow - it is just not made for that kind of workload. We do use Airflow too, but we offload most of the tasks to our (separate!) Celery cluster (Airflow workers just send tasks to the Celery cluster and poll for results). – DejanLekic Mar 30 '20 at 19:13
  • Also, I need a simple scheduler (for routine jobs, not very heavy jobs) for clients like a Raspberry Pi. I think the `schedule` module would be enough for that. – Ali Hallaji Mar 30 '20 at 19:19

I'm one of the core devs for Sanic. I would agree with the other answers that Celery is a great option. For anyone in need of a more lightweight solution, I have a post about an alternative approach using only Sanic itself: https://community.sanicframework.org/t/how-to-use-asyncio-queues-in-sanic/166/4
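To give a concrete flavour of that approach, here is an illustrative sketch (not the code from the linked post; the route, queue name and job payload are made up):

```python
# In-process job queue inside Sanic using asyncio (illustrative sketch)
import asyncio

from sanic import Sanic
from sanic.response import json

app = Sanic("WorkerDemo")

async def worker(app):
    # Consume and process jobs one at a time in the background.
    while True:
        job = await app.ctx.queue.get()
        print("processing", job)

@app.listener("before_server_start")
async def setup(app, loop):
    app.ctx.queue = asyncio.Queue()
    app.add_task(worker(app))

@app.post("/jobs")
async def enqueue(request):
    # Push the JSON body onto the queue and return immediately.
    await request.app.ctx.queue.put(request.json)
    return json({"queued": True})
```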

Adam Hopkins

Starting a new process in the background in Python is as simple as calling os.fork(). For a comprehensive example, see https://python-course.eu/forking.php
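A bare-bones illustration of that idea (the loop body is just a placeholder):

```python
import os
import time

pid = os.fork()
if pid == 0:
    # Child process: keeps running in the background.
    while True:
        print("background tick")
        time.sleep(60)
else:
    # Parent process: continues with the main program.
    print("forked background worker, child pid:", pid)
```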

EDIT:

For a fully featured solution, I'd recommend forking a background process as described above, and then using a library like https://github.com/dbader/schedule to execute jobs at scheduled intervals in that background process.
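For example (the job functions and times below are placeholders), the forked child could run a loop like this with the `schedule` library:

```python
import os
import time

import schedule

def heartbeat():
    print("heartbeat job")

def nightly_report():
    print("nightly report job")

if os.fork() == 0:
    # Child process: run scheduled jobs forever.
    schedule.every(1).seconds.do(heartbeat)               # every second
    schedule.every().day.at("02:30").do(nightly_report)   # at a specific time
    while True:
        schedule.run_pending()
        time.sleep(1)
```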

  • I'm looking for a complete library/solution with features such as executing a job every second or starting a job at a specific time. I don't want to reinvent the wheel. – Ali Hallaji Mar 30 '20 at 15:21
  • I've edited my answer. It should meet your requirements now. –  Mar 30 '20 at 15:53
  • Forking, subprocess.Popen or multiprocessing.Process is easy. The hard part is getting them terminated (reliably) when the server is shut down or restarted. And doing that gracefully is nearly impossible (especially on Windows). – Tronic Apr 02 '20 at 13:51
  • On Linux (which, based on the docker tag and the mention of a Raspberry Pi, I'd say is almost certainly what OP is dealing with), terminating a subprocess is as easy as sending a SIGKILL and then just running `waitpid`. –  Apr 02 '20 at 13:55
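For illustration, that kill-and-reap teardown from the last comment is roughly the following (the sleeping child stands in for the real background worker):

```python
import os
import signal
import time

pid = os.fork()
if pid == 0:
    # Child: pretend to be the background worker.
    while True:
        time.sleep(1)

# Parent: later, on shutdown, force-terminate and reap the child.
os.kill(pid, signal.SIGKILL)   # force-terminate the child
os.waitpid(pid, 0)             # reap it so it does not linger as a zombie
```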