
I’d like to parallelize a batch of tasks across two computers on different networks, but I’m not sure how to do this in Python.

Suppose I have two computers, Computer A and Computer B, on two different networks, and a batch of 100 tasks to be accomplished. Naively, I could assign Computer A and Computer B to each do 50 tasks, but if Computer A finishes its tasks before Computer B, I would like Computer A to take on some of Computer B’s remaining tasks. Both computers should return the results of their tasks to my local machine. How can this be done?


1 Answer

  • You need to create a distributed queue that can work across different networks, such as RabbitMQ.
  • Put all your tasks in the queue.
  • Create a central worker-management tool that lets you create and manage workers on Computer A and Computer B. The workers will process your tasks.
  • You also need to take care of worker availability to achieve what you described: if Computer A finishes its tasks before Computer B, Computer A should take on some of Computer B’s remaining tasks.

Luckily, Python has an excellent library, Celery, which lets you achieve exactly what you want. It is well documented and has a large, diverse community of users and contributors. You just need to set up a broker (the queue) and configure Celery.
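As a rough illustration, a minimal task module might look like the sketch below. The broker URL, the `rpc://` result backend, the module name `tasks.py`, and the task body are all placeholder assumptions, not part of the question; the broker would be a RabbitMQ instance that both computers can reach over the internet.

```python
# tasks.py -- a minimal sketch; broker URL, backend, and task body are assumptions.
from celery import Celery

app = Celery(
    "batch",
    broker="amqp://user:password@broker-host:5672//",  # assumed RabbitMQ location
    backend="rpc://",  # send results back to the client that submitted the tasks
)

@app.task
def run_task(task_id):
    # Replace with the real work for one of the 100 tasks.
    return f"task {task_id} done"
```

Each of Computer A and Computer B would then run a worker against the same broker, e.g. `celery -A tasks worker --loglevel=info`. Because both workers consume from the same queue, whichever one finishes its current work simply keeps pulling the remaining tasks, which gives you the load balancing you described.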

Celery has lots of features you can use depending on your requirements: monitoring, job scheduling, and Celery canvas (for composing workflows), to name a few.
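For example, dispatching the whole batch from your local machine and collecting the results could look roughly like this sketch, which assumes the hypothetical `tasks.py` module above is importable and the broker is reachable:

```python
# dispatch.py -- a sketch of submitting the batch and waiting for all results.
from celery import group

from tasks import run_task

# 100 task signatures; any free worker on Computer A or B will pick them up.
job = group(run_task.s(i) for i in range(100))
result = job.apply_async()

# Blocks until every task has finished, then returns the results in task order.
print(result.get(timeout=3600))
```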

https://docs.celeryproject.org/en/stable/getting-started/introduction.html
https://medium.com/swlh/python-developers-celery-is-a-must-learn-technology-heres-how-to-get-started-578f5d63fab3
