The following code raises NameError: name 'b' is not defined:

from concurrent.futures import ProcessPoolExecutor
def add(a):
    return a+b

if __name__ == '__main__':
    b = 3
    with ProcessPoolExecutor() as executor:
        test = executor.submit(add,5)
    print(test.result())

But if you move b = 3 before the if statement, as shown below, it runs without error. Why?

from concurrent.futures import ProcessPoolExecutor
def add(a):
    return a+b

b = 3
if __name__ == '__main__':
    with ProcessPoolExecutor() as executor:
        test = executor.submit(add,5)
    print(test.result())

And by the way, what's the best practice for defining global variables like b that are used inside functions?

Tomoon
  • Best practice is to avoid globals as much as possible. That said, using globals for constant values is fairly typical, and it is common practice to give constants all-uppercase names. As to your main question: please read up on how spawn works (vs. fork) in Python multiprocessing. It has been described many times by people more eloquent than myself. TL;DR: the child process gets access to things in the "main" file by importing it, and code under `if __name__ == '__main__':` is not executed on import. That guard is important to keep, though, so that you don't get an infinite loop of child processes. – Aaron Aug 14 '21 at 02:43
  • Thanks for pointing me to the right direction! Much appreciated! @Aaron – Tomoon Aug 14 '21 at 02:56
  • Adding to what @Aaron said: b is not really a global when it is assigned inside `if __name__ == '__main__':`. It is conditionally global, or whatever you want to call it, because it only exists if that branch runs. If you put `b` in an else branch after the `if __name__` check, it will probably be recognized. – tchar Aug 14 '21 at 02:57
  • I found the following lines in the Python docs; not sure if they refer to this problem? "Global variables: Bear in mind that if code run in a child process tries to access a global variable, then the value it sees (if any) may not be the same as the value in the parent process at the time that Process.start was called. However, global variables which are just module level constants cause no problems." – Tomoon Aug 14 '21 at 03:23
  • Not quite your problem, but possibly one you could run into: the child does not see the same variable. It was simply re-created with the same value, so the child has a copy of its own. If a child modifies a global variable, only its copy gets modified. The "global"-ness is only within the process. – Aaron Aug 14 '21 at 03:54
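
The import behaviour described in the comments can be reproduced without starting any worker processes. The following is a minimal sketch, assuming the spawn start method (the default on Windows and macOS), where the child re-runs the main file under the name `__mp_main__` instead of `__main__`. The helper `module_globals` and the `SCRIPT` string are hypothetical names introduced only for this illustration, and `runpy.run_path` is used here as an approximation of what the child process does on startup.

```python
import os
import runpy
import tempfile

# The failing script from the question, reduced to the parts that matter.
SCRIPT = """
def add(a):
    return a + b

if __name__ == '__main__':
    b = 3
"""

def module_globals(source, run_name):
    """Run `source` as a file under the given __name__ and return its globals."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "script.py")
        with open(path, "w") as f:
            f.write(source)
        return runpy.run_path(path, run_name=run_name)

# Parent: the file runs as __main__, so the guarded block executes.
as_parent = module_globals(SCRIPT, "__main__")
# Child (approximation): the file runs as __mp_main__, so the guard is skipped.
as_child = module_globals(SCRIPT, "__mp_main__")

print("b" in as_parent)  # True
print("b" in as_child)   # False
```

In both runs `add` is defined, but only the parent's namespace contains `b`, which is why `add` raises NameError when it executes in the child. Moving `b = 3` above the guard makes it part of what the child re-creates on import.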

1 Answer

The comments by @Aaron and @tchar are quite fitting. But I did want to point out that there is a way of initializing each process in your pool with whatever global variables you need, by using the initializer and initargs arguments of the ProcessPoolExecutor constructor:

from concurrent.futures import ProcessPoolExecutor

def init_pool(b_value):
    # Initialize the global variables of each pool process:
    global b
    b = b_value

def add(a):
    return a+b

if __name__ == '__main__':
    b = 3
    with ProcessPoolExecutor(initializer=init_pool, initargs=(b,)) as executor:
        test = executor.submit(add,5)
    print(test.result())
Booboo
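
As an alternative to the initializer approach, and in line with the advice in the comments to avoid globals where possible, the value can be passed to the worker function as an explicit argument. This is a minimal sketch rather than part of the original answer: it assumes `add` can be changed to take `b` as a second parameter, and it relies on `executor.submit` forwarding extra positional arguments to the callable.

```python
from concurrent.futures import ProcessPoolExecutor

def add(a, b):
    # b is an ordinary parameter, so no module-level state is needed.
    return a + b

if __name__ == '__main__':
    b = 3
    with ProcessPoolExecutor() as executor:
        # Extra positional arguments to submit() are forwarded to add().
        future = executor.submit(add, 5, b)
    print(future.result())
```

Because `b` travels with each task, the result does not depend on what the child process sees at import time, and it works the same way under both fork and spawn.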