
I'm looking for a way to run scripts with multiprocessing. I have a function that launches 4 processes; each process executes a script through runpy.run_path() and hands me back a return value.

Example:

import runpy
from multiprocessing import Process, Manager, Lock

def valorise(product, dico_valo):
    # Run the script in a fresh namespace and keep its "ret" value.
    res = runpy.run_path(product + "/PyScript.py", run_name="__main__")
    dico_valo[product] = res["ret"]

def f(mutex, l, dico):
    while True:
        # Pop under the lock: checking len(l) outside it would race
        # with the other worker processes.
        mutex.acquire()
        try:
            if len(l) == 0:
                return
            product = l.pop(0)
        finally:
            mutex.release()
        p = Process(target=valorise, args=(product, dico))
        p.start()
        p.join()

def run_parallel_computations(valuationDate, list_scripts):
    if len(list_scripts) > 0:
        print '\n\nPARALLEL COMPUTATIONS BEGIN..........\n\n'
        manager = Manager()
        l = manager.list(list_scripts)
        dico = manager.dict()
        mutex = Lock()
        p1 = Process(target=f, args=(mutex, l, dico), name="script1")
        p2 = Process(target=f, args=(mutex, l, dico), name="script2")
        p3 = Process(target=f, args=(mutex, l, dico), name="script3")
        p4 = Process(target=f, args=(mutex, l, dico), name="script4")
        p1.start()
        p2.start()
        p3.start()
        p4.start()
        p1.join()
        p2.join()
        p3.join()
        p4.join()
        # Copy the managed dict into a plain dict before returning it.
        dico_isin = {}
        for key in dico.keys():
            dico_isin[key] = dico[key]
        print '\n\nPARALLEL COMPUTATIONS END..........'
        return dico_isin
    else:
        print '\n\nNOTHING TO PRICE !'
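
For reference, a hypothetical call (the date value and product folder names below are made-up examples, not from my real setup):

results = run_parallel_computations('2014-06-12', ['productA', 'productB'])
print results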

Every PyScript.py imports a library, and each script has to import it again for itself. However, it doesn't work the way I want and I don't understand why: my library is imported once, during the first process, and that same import is reused by the other processes. Could you help me?
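
For illustration, a minimal sketch of one way to force each run to re-execute the library's import, assuming the library is named mylib (a placeholder; the real name isn't given here), is to drop it from sys.modules before calling run_path:

import runpy
import sys

def valorise(product, dico_valo):
    # Forget any cached copy of the library so that the script's own
    # import statement re-executes the module body with fresh globals.
    sys.modules.pop('mylib', None)   # 'mylib' is a placeholder name
    res = runpy.run_path(product + "/PyScript.py", run_name="__main__")
    dico_valo[product] = res["ret"]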

Thank you!

  • What do you mean, "it doesn't work"? Does your PC spontaneously combust, or create black holes, or generate error messages? I think we will need more to go on. It could be something as simple as a sys.path issue - but who knows? – Tony Suffolk 66 Jun 12 '14 at 16:31
  • Sorry, I edited. "My library is imported once during the first process and the same "import" is used in the other processes." In this library, I have global variables and I want to reset them during each "runpy.run_path". (Of course, I cannot edit this library...) – EntrustName Jun 12 '14 at 16:40
  • 1
    Are you expecting these global variables to be shared between the processes since all processes import the same library module? If you are then that won't happen - you will need to use a different system (shared memory for is – Tony Suffolk 66 Jun 12 '14 at 16:44
  • No, the opposite. I expected each process to have its own global variables, but that is not the case. As though I were using `os.system()` to run my script. – EntrustName Jun 13 '14 at 08:24

1 Answer


This might not be exactly the case with multiprocessing (but it looks like it is). When you try to import something more than once (e.g. `import re` in most of your modules), Python will not 're-import' it: it sees the module already listed in `sys.modules` and skips the import.
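
You can see the caching directly in a single process:

import sys

import re                   # first import: executes the module body
print 're' in sys.modules   # True - the module is now cached
import re                   # second import: just a sys.modules lookup,
                            # the module body is NOT executed again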

To force reloading you can try `reload(module_name)` (it cannot reload a single class/method imported from a module; you can only reload the whole module or nothing).
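
A minimal sketch of that in Python 2, with `mymodule` as a hypothetical module name:

import sys
import mymodule         # hypothetical module with mutable globals

mymodule.counter = 42   # its globals get mutated somewhere

reload(mymodule)        # re-executes the module body, resetting its globals
# Note: names bound with "from mymodule import f" are NOT refreshed.

# Alternative: forget the module entirely, so the next import is fresh.
del sys.modules['mymodule']
import mymodule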

  • It's strange, but in the case of multiprocessing, Python imports only once, so my global variables in the library are shared, and I don't want that. Each process needs to have its own global variables. I could use `reload`, but I'm looking for a more elegant solution. To sum up, I would like the same behavior as when I use `os.system()`. – EntrustName Jun 13 '14 at 08:20
  • Would it be possible to use classes (instances) to encapsulate the variables, rather than using globals? – Fuxi Jun 14 '14 at 12:06
  • Yes, it would be possible, but we cannot edit the sources of the library. – EntrustName Jun 16 '14 at 08:31
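
Since the library cannot be edited, one way to get genuine `os.system()`-style isolation is to launch each script in a brand-new interpreter with subprocess. This is a sketch under the assumption that PyScript.py can print its result as JSON on stdout; the post does not say how "ret" would be exported in that case:

import json
import subprocess

def valorise(product, dico_valo):
    # A fresh interpreter per run: the library's globals start from
    # scratch every time, exactly as with os.system().
    out = subprocess.check_output(['python', product + '/PyScript.py'])
    dico_valo[product] = json.loads(out)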