
Here's a question about MPI. I need two processors that keep modifying one variable, and I want both processors to have access to the variable's most up-to-date value.

from mpi4py import MPI
from time import sleep

comm = MPI.COMM_WORLD
rank = comm.rank
assert comm.size == 2

msg = 0
sec = 10
if rank == 0:
    for i in range(sec):
        print(msg)
        sleep(1)
        msg = comm.bcast(msg, root=1)
else:
    for i in range(sec * 2):
        msg += 1
        sleep(0.5)
        comm.bcast(msg, root=1)

So I'm expecting the program to print something like: 0 2 4 ...

But the program turns out to print: 0 1 2 3 4 5 6 7 8 9

I'm curious whether there's a mechanism in mpi4py such that the variable msg is shared by both processors. That is, whenever msg is modified by processor 1, the new value should become immediately available to processor 0. In other words, I want processor 0 to access the most up-to-date value of msg instead of waiting for every single change that processor 1 makes to it.

zhh210
  • Your code broadcasts the value from rank 1, which gets incremented by one in each iteration, hence the observed output. – Hristo Iliev Jul 08 '13 at 22:17
  • Sorry for misleading. I've already changed the code. My question is: at time 1 sec, rank 1 has already increased msg to 2, so why does rank 0 still print msg as 1? I need a way to make rank 0 accept msg with value 2 rather than 1. – zhh210 Jul 09 '13 at 00:13

2 Answers


I think you're getting confused about how distributed memory programming works. In MPI, each process (or rank) has its own memory, and therefore when it changes values via load/store operations (like what you're doing with msg += 1), it will not affect the value of the variable on another process. The only way to update remote values is by sending messages, which you are doing with the comm.bcast() call. This sends the local value of msg from rank 1 to all other ranks. Until this point, there's no way for rank 0 to know what's been happening on rank 1.
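
To make this concrete, here is a small sketch of the same exchange written with explicit point-to-point calls (this is my own illustration, not code from the question or the original answer): rank 1 sends every update, and rank 0 receives those messages strictly in the order they were sent, so it lags behind rather than seeing the latest value.

from mpi4py import MPI
from time import sleep

comm = MPI.COMM_WORLD
rank = comm.rank
assert comm.size == 2

if rank == 1:
    msg = 0
    for i in range(10):
        msg += 1                    # changes only rank 1's private copy of msg
        sleep(0.5)
        comm.send(msg, dest=0)      # rank 0 learns the new value only via this message
else:
    msg = 0
    for i in range(10):
        print(msg)
        sleep(1)
        msg = comm.recv(source=1)   # matched with rank 1's sends, in order

Because MPI delivers messages between a given pair of ranks in the order they were posted, rank 0 prints 0 1 2 ... here as well; the collective bcast in the question is matched in order in exactly the same way.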

If you want to have shared values between processes, then you probably need to look at something else, such as threads. You'll lose the distributed capabilities of MPI if you switch to OpenMP, but that might not be what you needed MPI for in the first place. There are ways of doing this with distributed memory models (such as PGAS languages like Unified Parallel C, Global Arrays, etc.), but you will always run into the issue of latency, which means there will be some window of time during which the values on ranks 0 and 1 are not synchronized unless you have some sort of protection to enforce it.

Wesley Bland
  • Rank 1 broadcasts 20 times but rank 0 only receives the first 10 updates, which is not what I want. I understand that rank 0 only updates msg ten times, but why does it accept the first 10 updates made by rank 1? Is there a queue, so that rank 0 always follows the order in which msg was sent to the queue? – zhh210 Jul 09 '13 at 15:02

As mentioned by Wesley Bland, this isn't really possible in a pure distributed memory environment, as memory isn't shared.

However, MPI has allowed something like this for some time: one-sided communication was introduced in MPI-2 (1997) and updated significantly in MPI-3 (2012). This approach can have real advantages, but one has to be a little careful; since memory isn't really shared, every update requires expensive communication, and it's easy to accidentally introduce significant scalability/performance bottlenecks in your code by over-relying on shared state.

The Using MPI-2 book has an example of implementing a counter using MPI-2 one-sided communication; a simple version of that counter is described and implemented in this answer in C. In the mpi4py distribution, under 'demos', there are implementations of the same counters in the 'nxtval' demo: the simple counter as nxtval-onesided.py, and a more complicated but more scalable implementation, also described in the Using MPI-2 book, as nxtval-scalable.py. You should be able to use either of those implementations more or less as-is in the code above.
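
As a reference point, here is a minimal sketch of the idea using mpi4py's one-sided (RMA) interface directly (this is my own simplification, not the nxtval demo itself): rank 1 exposes its value through an MPI window and keeps incrementing it, while rank 0 reads whatever the current value is with passive-target Lock/Get/Unlock.

from mpi4py import MPI
from time import sleep
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.rank
assert comm.size == 2

value = np.zeros(1, dtype='i')           # rank 1's copy is the one everybody reads
win = MPI.Win.Create(value, comm=comm)   # expose the buffer as an RMA window

if rank == 1:
    for i in range(20):
        win.Lock(1)                      # lock our own window before touching the exposed memory
        value[0] += 1
        win.Unlock(1)
        sleep(0.5)
else:
    snapshot = np.zeros(1, dtype='i')
    for i in range(10):
        win.Lock(1)                      # passive-target access to rank 1's memory
        win.Get([snapshot, MPI.INT], 1)  # fetch whatever the value is right now
        win.Unlock(1)
        print(snapshot[0])
        sleep(1)

win.Free()

With this, rank 0 samples the value at its own pace and should see roughly 0 2 4 ..., which is closer to what the question asks for; the nxtval demos mentioned above take care of the synchronization needed to keep concurrent updates safe and scalable.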

Jonathan Dursi
  • Thanks, the example is very useful for me. I think my problem can be solved by the one-sided communications of MPI. – zhh210 Jul 09 '13 at 15:31