In the code below, the print statement runs once per process, including the main process: as the example shows, I have 2 worker processes and the statement is printed 3 times. I expected it to run only once, since each subprocess should execute only the target function.

Second issue: access to variables doesn't seem consistent to me. I can access the dic variable and increment its value, but I get an error when I try to do the same with the variable num.

import concurrent.futures as cf

print("Not inside of the target funtion!")
num = 0
dic = {"X":1, "Y":0}

def target(n):
    dic["X"] += 1
    dic[n] = n
    print(dic)
    # print(dic["X"])
    try:
        print(num)
        num += 1
    except Exception as e:
        print(e)

if __name__ == '__main__':
    with cf.ProcessPoolExecutor(2) as ex:
        ex.map(target, range(3))
    print(dic)
# Output 
# Not inside of the target funtion!
# Not inside of the target funtion!
# {'X': 2, 'Y': 0, 0: 0}
# local variable 'num' referenced before assignment
# {'X': 3, 'Y': 0, 0: 0, 1: 1}
# local variable 'num' referenced before assignment
# {'X': 4, 'Y': 0, 0: 0, 1: 1, 2: 2}
# local variable 'num' referenced before assignment
# Not inside of the target funtion!
# {'X': 1, 'Y': 0}
Marsilinou Zaky
  • The script is being read and executed multiple times, once for each process including the main one. Try moving the `print("Not inside of the target funtion!")` to inside the `if __name__ == '__main__':` guard to verify this. – martineau Jan 05 '20 at 00:32
  • @martineau Yes, I've tried that, but the question here why is it behaving this way as I'm calling the target function only and not the main script – Marsilinou Zaky Jan 05 '20 at 00:34
  • It happens because each process runs in its own memory-space, so there are no shared global variables. Python works around that by running a separate instance of the interpreter in each one, and each of those needs to reimport the script. That's why the `if __name__ == '__main__'` conditional is required. You're going to also discover that each process has only updated its own copy of `dic`. – martineau Jan 05 '20 at 00:39
  • _why is it behaving this way as I'm calling the target function only and not the main script_ I'm not sure I understand what you mean, can you elaborate? As an aside, `except Exception as e: print(e)` is probably a bad practice. I suggest learning more about multiprocessing in Python, and the `concurrent.futures` library, before anything else. – AMC Jan 05 '20 at 01:55

1 Answer

`ex.map(target, range(3))` creates 3 tasks to execute. True, you have only 2 processes to run them, so the third task simply waits until one of the workers is free. Processes in a pool are reusable; that is the whole point of a process pool. For CPU-bound work there is generally little point in making the pool larger than the number of processors on your computer, because that number ultimately limits how many processes can actually run in parallel.

However, as @martineau has said, the script is re-imported in each worker process, and there the test `if __name__ == '__main__'` fails. This is why you see "Not inside of the target funtion!" printed 3 times (once for the main process and once for each of the 2 workers) but you don't get into a loop that keeps launching new processes.
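To see the re-import for yourself, here is a minimal sketch. It assumes the workers are started with the spawn start method (the default on Windows); with fork the workers inherit the already-imported module, so the module-level line would print only once. The returned pid tuple is my own addition for illustration, not part of the original code.

```python
import concurrent.futures as cf
import os

# Module-level code: runs in the main process, and again in every worker
# that has to re-import this file (assumption: "spawn" start method).
print(f"importing as {__name__!r} in pid {os.getpid()}")


def target(n):
    """Return the task number together with the pid of the process that ran it."""
    return n, os.getpid()


if __name__ == "__main__":
    # Only the main process gets here; in a re-imported worker the module
    # name is not "__main__", so no new pool is created there.
    with cf.ProcessPoolExecutor(2) as ex:
        results = list(ex.map(target, range(3)))
    worker_pids = {pid for _, pid in results}
    print(f"3 tasks ran on {len(worker_pids)} worker process(es)")
```

The three tasks are spread over at most two worker pids, which shows that the pool reuses its processes rather than starting one per task.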

You Need to Insert a global num Statement At the Start of Function target:

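Variable `num` is a global variable. If all you are doing is reading the variable and not assigning to it within a function, no action is required. But an assignment such as `num += 1` anywhere in the function makes `num` local to that whole function, so the earlier `print(num)` fails with "local variable 'num' referenced before assignment". To modify the global, you must declare it within the function as `global num`. You do not have the same problem with `dic` because you never assign to the name `dic` itself. `dic` is a reference and you are modifying what it refers to, i.e., a value in a dictionary whose key is "X".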
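The difference can be reproduced without any multiprocessing. The following is a small sketch; the function names are made up for illustration and are not from the question.

```python
num = 0
dic = {"X": 1}


def read_only():
    # Reading a global needs no declaration.
    return num


def broken_increment():
    # The assignment makes num local to the whole function, so the
    # read half of "+=" finds no value yet and raises UnboundLocalError.
    num += 1


def fixed_increment():
    global num  # rebind the module-level name instead of a local one
    num += 1
    return num


def mutate_dict():
    # This changes the object that dic refers to; the name dic itself
    # is never assigned, so no global declaration is needed.
    dic["X"] += 1
    return dic["X"]


try:
    broken_increment()
except UnboundLocalError as exc:
    print("broken_increment failed:", exc)

print("fixed_increment ->", fixed_increment())
print("mutate_dict ->", mutate_dict())
```

The exact wording of the error message depends on the Python version, which is why the sketch prints it rather than hard-coding it.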

Booboo
  • Good point about `num` being a variable local to the function. I totally spaced on that aspect of the output in my comments under the question. That said, IMO using so much **Bold** emphasis and CamelCase words in your answer doesn't enhance it. – martineau Jan 05 '20 at 01:15
  • @martineau The OP had asked two questions and the bold was meant to be captions to the two answers with more detailed descriptions following. Sorry, I will tone it down. – Booboo Jan 05 '20 at 01:25
  • Thank you @Booboo and @martineau. I'm now clear about the print statement being displayed multiple times, but I'm not too clear about the variables: when I added `dic[n] = n` in the `target` function, all the processes seemed to share the same `dic` variable, but when printing it at the end it's not modified. I will update the code in the OP to show that – Marsilinou Zaky Jan 05 '20 at 02:23
  • @Marsilinou: Each process has its own copy of `num` and `dic` initialized to the same values. These values are only modified in the processes where the target function gets run, but those changes are not visible in the main process because its own copies aren't affected. As I said, each process runs in its own memory-space, so they don't share global variables. – martineau Jan 05 '20 at 07:04
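Since changes made in a worker never reach the main process's copy of `dic`, one common way around this is to have `target` return its result and let the main process do the updating. The sketch below assumes that approach; `merge` is a hypothetical helper, not something from the question or from `concurrent.futures`.

```python
import concurrent.futures as cf


def target(n):
    """Return this task's contribution instead of modifying a global."""
    return {n: n}


def merge(base, updates):
    """Hypothetical helper: fold the per-task dicts into a copy of base."""
    merged = dict(base)
    for update in updates:
        merged.update(update)
        merged["X"] += 1  # count one completed task, in the main process
    return merged


if __name__ == "__main__":
    dic = {"X": 1, "Y": 0}
    with cf.ProcessPoolExecutor(2) as ex:
        # map() yields the workers' return values in task order.
        results = list(ex.map(target, range(3)))
    dic = merge(dic, results)
    print(dic)
```

The return values are pickled and sent back to the main process, so only the main process's `dic` is ever modified and the final print reflects all three tasks.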