
I’d like to parallelize a batch of tasks across two computers on different networks, but I’m not sure how to do this in Python.

Suppose I have two computers, Computer A and Computer B, on two different networks, and a batch of 100 tasks to be accomplished. Naively, I could assign Computer A and Computer B to each do 50 tasks, but if Computer A finishes its tasks before Computer B, I would like Computer A to take on some of Computer B’s remaining tasks. Both computers should return the results of their tasks to my local machine. How can this be done?


1 Answer

  • You need to create a distributed queue that can work across different networks, such as RabbitMQ.
  • Put all your tasks in the queue.
  • Create a central worker-management tool that lets you create and manage workers on Computer A and Computer B. The workers will process your tasks.
  • You also need to take care of worker availability to achieve what you described: if Computer A finishes its tasks before Computer B, Computer A should take on some of Computer B’s remaining tasks.

Luckily, Python has an excellent library, Celery, which lets you achieve exactly what you want. It is well documented and has a large, diverse community of users and contributors. You just need to set up a broker (the queue) and configure Celery.
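As a rough illustration, a minimal task module might look like the sketch below. The broker URL, the `rpc://` result backend, the module name `tasks.py`, and the task body are all placeholder assumptions, not part of the question; the broker would be a RabbitMQ instance that both computers can reach over the internet.

```python
# tasks.py -- a minimal sketch; broker URL, backend, and task body are assumptions.
from celery import Celery

app = Celery(
    "batch",
    broker="amqp://user:password@broker-host:5672//",  # assumed RabbitMQ location
    backend="rpc://",  # send results back to the client that submitted the tasks
)

@app.task
def run_task(task_id):
    # Replace with the real work for one of the 100 tasks.
    return f"task {task_id} done"
```

Each of Computer A and Computer B would then run a worker against the same broker, e.g. `celery -A tasks worker --loglevel=info`. Because both workers consume from the same queue, whichever one finishes its current work simply keeps pulling the remaining tasks, which gives you the load balancing you described.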

Celery has lots of features you can use depending on your requirements: monitoring, job scheduling, and Celery canvas (for composing workflows), to name a few.
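For example, dispatching the whole batch from your local machine and collecting the results could look roughly like this sketch, which assumes the hypothetical `tasks.py` module above is importable and the broker is reachable:

```python
# dispatch.py -- a sketch of submitting the batch and waiting for all results.
from celery import group

from tasks import run_task

# 100 task signatures; any free worker on Computer A or B will pick them up.
job = group(run_task.s(i) for i in range(100))
result = job.apply_async()

# Blocks until every task has finished, then returns the results in task order.
print(result.get(timeout=3600))
```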

https://docs.celeryproject.org/en/stable/getting-started/introduction.html
https://medium.com/swlh/python-developers-celery-is-a-must-learn-technology-heres-how-to-get-started-578f5d63fab3
