I was having a problem with some more complex code but have reduced it to something that reproduces the issue:

from multiprocessing import Manager
from concurrent.futures import ProcessPoolExecutor
from itertools import repeat

def process(v):
    v.value += 1

def main():
    with Manager() as manager:
        v = manager.Value('i', 0)
        with ProcessPoolExecutor() as executor:
            executor.map(process, repeat(v, 10))
        print(v.value)

if __name__ == '__main__':
    main()

The value I expect to be printed is 10.

But here's the issue... It is 10 most of the time but sometimes it's 9 or even 8.

From the documentation: "A manager object returned by Manager() controls a server process which holds Python objects and allows other processes to manipulate them using proxies."

This behaviour suggests that the "server process" is unreliable in my environment.

I'm guessing that this is either a bug in my version of Python or an issue that's peculiar to macOS.

Environment details: Python 3.10.2, macOS 12.2.1, 3 GHz 10-Core Intel Xeon W, 32 GB RAM

Or maybe I've fundamentally misunderstood something.

DarkKnight

1 Answer

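It looks like `v.value += 1` isn't atomic (see the docs): the proxy reads the current value and then writes the new one back as a separate call, so two processes can read the same value and one of the increments gets lost. Using a Lock seems to do the trick: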

from concurrent.futures import ProcessPoolExecutor
from itertools import repeat
from multiprocessing import Manager


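def process(args):
    # Each task receives the shared value proxy together with the shared lock.
    v, lock = args
    # Hold the lock across the whole read-modify-write so no update is lost.
    with lock:
        v.value += 1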


def main():
    with Manager() as manager:
        lock = manager.Lock()
        v = manager.Value('i', 0)
        with ProcessPoolExecutor() as executor:
            executor.map(process, repeat((v, lock), 10))
        print(v.value)


if __name__ == '__main__':
    main()
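
To make the lost update easier to see, here is a minimal sketch that replays the unlucky interleaving by hand, without any real processes. `FakeValueProxy`, `interleaved_increments` and `serialized_increments` are hypothetical names made up for this illustration; the only assumption carried over from the real thing is that `v.value += 1` amounts to a read followed by a separate write.

```python
class FakeValueProxy:
    """Hypothetical stand-in for the proxy returned by manager.Value()."""

    def __init__(self, initial):
        # Plays the role of the object held by the manager's server process.
        self._stored = initial

    def get(self):
        # First round trip: read the current value.
        return self._stored

    def set(self, new_value):
        # Second, separate round trip: overwrite the value.
        self._stored = new_value


def interleaved_increments():
    """Replay two workers doing `v.value += 1` with an unlucky interleaving."""
    v = FakeValueProxy(0)
    seen_by_a = v.get()    # worker A reads 0
    seen_by_b = v.get()    # worker B reads 0 before A has written anything
    v.set(seen_by_a + 1)   # worker A writes 1
    v.set(seen_by_b + 1)   # worker B also writes 1, so A's update is lost
    return v.get()


def serialized_increments():
    """Same two increments, but each read/write pair completes before the next."""
    v = FakeValueProxy(0)
    for _ in range(2):
        v.set(v.get() + 1)
    return v.get()


print(interleaved_increments())  # 1
print(serialized_increments())   # 2
```

Each individual call to the server behaves correctly in both runs; the difference is only whether another worker's read can slip in between a read and its matching write, which is what the explicit lock rules out.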
Stefan Falk
  • I'm aware that utilising a discrete lock solves the problem but the point of this question is that Manager is supposed to manage synchronisation/locking. If it doesn't then what is its purpose? – DarkKnight Mar 16 '22 at 10:13
  • @ArthurKing Off the top of my head: you don't want to lock every time you read or write a value. With `manager.Value()` you can have a dedicated, non-locking process that increments the value while all other processes only read it. Your example is a different scenario, where all processes need to both read and write. Apparently, in this case you need an explicit lock. – Stefan Falk Mar 16 '22 at 10:30
  • @ArthurKing No problem. Maybe somebody else can elaborate on this a bit further, but I think it makes sense not to lock by default. Shared memory is always quite tricky, unfortunately. – Stefan Falk Mar 16 '22 at 10:42