
I have a class with a method foo(), and a dictionary with a bunch of objects of that class.

How can I parallelize the execution of foo() over the objects in my collection?

class MyClass:
    def __init__(self, a):
        self.a = a

    def foo(self):
        self.a += 1

if __name__ == "__main__":

    # get a dictionary full of independent objects
    dic = {}
    for kk in range(10):
        dic[kk] = MyClass(kk)

    # execute foo() on each object.
    # How to do it in a parallel way?
    for myObject in dic.values():
        myObject.foo()

PS: this is probably a silly question, really simple in other programming languages, but I did google it and did not find any straightforward solution.

SandiaDeDia
  • For a full answer and performance details, see https://web.archive.org/web/20201125045749/https://stackoverflow.com/questions/64963242/how-to-parallelize-a-class-method-in-python/64998753 – user3666197 Nov 25 '20 at 09:46

1 Answer


You didn't provide a code sample, so I'll give you some pseudocode, but this should give you an idea of how to solve it:

from multiprocessing import Pool

class MyClass:
    def foo(self, x):
        # Do your processing here
        return x

if __name__ == "__main__":

    # Your dictionary here
    some_dict = {"key1": {}, "key2": {}}

    # Create an instance of your class
    my_class = MyClass()

    # Create a pool of workers and map foo over the dictionary values
    with Pool(5) as p:
        p.map(my_class.foo, [obj for obj in some_dict.values()])
Gijs Wobben
  • While the code is readable, it has the awful property of killing itself when trying to assemble a list from any large enough dictionary, which will raise a MemoryError. The same might survive if using generators / iterators instead of the list comprehension (see the sketch after these comments). Finally: the overhead costs of spawning a pool of processes, plus the SER/comms/DES costs of passing the parameters' instances to the workers and of passing the returned objects back, can add up to such a large sum that this strategy becomes rather an anti-pattern – user3666197 Nov 23 '20 at 08:26
  • To read more on this, one may review the contemporary update to ***Amdahl's Law***, where overhead costs and atomicity of processing were included in the original Amdahl's argument (a simplified form is sketched after these comments). --- https://stackoverflow.com/revisions/18374629/3 – user3666197 Nov 23 '20 at 08:28
  • @user3666197, sure it's not optimized in any way. A dict is probably not the most efficient way to store a large dataset anyway, but I was just answering the question... – Gijs Wobben Nov 23 '20 at 08:31
  • :o) Sure, there was not a word against just answering the question. All remarks were related to the responsibility of using such code: to the dangers of its known O(n) scaling in the time and space domains, and to the actual add-on costs one will have to pay when running the proposed code in the real world. Last but not least, I agree that even more dangers are hidden inside the dictionary of objects, where the default multiprocessing serialiser (a pickle-based SER/DES) may, and often does, crash due to its limited ability to serialise some types – user3666197 Nov 23 '20 at 09:01
  • A Windows-type O/S has one more surprise: spawning will actually cost you one full copy of the current state of the Python interpreter per worker in the multiprocessing.Pool, so one may pay draconian add-on costs in full-copy RAM allocations and in the RAM-to-RAM copies' memory I/O, and swapping may soon start once the Pool workers' replicated RAM footprint exceeds the free physical RAM available - so more bad news down the road. ***Be well,** @GijsWobben this is normal, this is just how the real world works - stay tuned ;o)* – user3666197 Nov 23 '20 at 09:09
  • @GijsWobben doesn't your code use the same instance of the class to compute foo with the dictionary items as arguments? What I need is to parallelize the execution of foo over a bunch of instances in a dictionary. I edited the question for clarity. – SandiaDeDia Nov 24 '20 at 00:00
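As a rough illustration of the overhead-extended argument referenced in the comments above (the notation here is my own simplification and may not match the linked revision exactly): if a fraction p of the work can be parallelized over N workers, and process spawning plus SER/comms/DES add a fixed fraction o of the original single-process runtime, the achievable speedup becomes

    S(N) = \frac{1}{(1 - p) + o + \dfrac{p}{N}}

so for small tasks the overhead term o can dominate and even push S(N) below 1, which is the sense in which the Pool-based approach can turn into an anti-pattern.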
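And a minimal runnable sketch of the pattern the question is actually after: applying foo() to every MyClass instance in the dictionary in parallel. The helper call_foo, the pool size of 5, and the use of Pool.imap are illustrative choices rather than anything from the original posts. The important caveat is that multiprocessing pickles each instance into its worker, so foo() has to return the mutated copy and the parent has to rebuild the dictionary from the results instead of expecting the original objects to be updated in place; imap also consumes dic.values() lazily, which is the generator/iterator alternative mentioned in the first comment.

from multiprocessing import Pool

class MyClass:
    def __init__(self, a):
        self.a = a

    def foo(self):
        self.a += 1
        return self  # return the mutated copy so the parent process can keep it

def call_foo(obj):
    # runs in a worker process on a pickled copy of obj
    return obj.foo()

if __name__ == "__main__":
    dic = {kk: MyClass(kk) for kk in range(10)}

    with Pool(5) as p:
        # imap hands the instances to the workers lazily and yields results in order
        dic = dict(zip(dic.keys(), p.imap(call_foo, dic.values())))

    print([obj.a for obj in dic.values()])  # [1, 2, ..., 10]

Because every call still pays the pickling and process start-up costs described in the comment thread, this only makes sense when foo() does substantially more work per object than a single increment.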