Chapel - Problems With Multilocale Configuration of GASNET MPI substrate

Question

I have Chapel code that uses forall loops with distributed iterators, and I'm trying to run it on a cluster.

The code runs perfectly when using the UDP conduit.

Now I'm trying to use the portable MPI as the internal layer, without success.

Here is my configuration:

export CHPL_TASKS=qthreads

export CHPL_COMM=gasnet

export CHPL_COMM_SUBSTRATE=mpi

export CHPL_LAUNCHER=gasnetrun_mpi

With only this configuration, only one node was used. Looking at the GASNet documentation, I added:

export GASNET_NODEFILE="$(pwd)"/nodes

export MPIRUN_CMD='mpirun -np %N -machinefile %H %C'

(these details are missing in the official documentation).

Ok, now I can run Chapel code using MPI. BUT:

1) Each node has 32 cores. If I run hello6 -nl x with x < 33, all processes run on the first node.

1.1) I would like to run hello6 -nl 4 so that each node says hello from locale x, address x.address.

2) It looks like Chapel uses $OAR_NODEFILE (or maybe another file) to create the Locales vector, because $OAR_NODEFILE has one entry per core for each node.

3) However, even if I manually change both $GASNET_NODEFILE and $OAR_NODEFILE, the Locales vector still contains one entry per core for each CPU node.

4) On the cluster I have access to, I run MPI codes like this: mpirun -machinefile $OAR_NODEFILE ~/program. However, GASNet requires the syntax used in the last variable exported above (MPIRUN_CMD).
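The nodefile layout described in points 2 and 3 (one line per core) can be inspected with standard tools. The sketch below uses a hypothetical sample file in place of $OAR_NODEFILE, with made-up host names node1 and node2. It only shows how to count entries per host and write a one-line-per-host copy. Whether the launcher actually reads such a file is not established here, and point 3 reports that editing the nodefiles did not change the Locales vector.

```shell
# Hypothetical sample standing in for $OAR_NODEFILE: one line per core.
# The host names node1/node2 are placeholders, not real cluster hosts.
nodefile="$(mktemp)"
printf '%s\n' node1 node1 node1 node2 node2 node2 > "$nodefile"

# Show how many entries each host has in the file.
sort "$nodefile" | uniq -c

# Write a copy with exactly one line per host.
sort -u "$nodefile" > "${nodefile}.unique"

# Count the distinct hosts (tr strips padding that some wc versions add).
unique_hosts="$(wc -l < "${nodefile}.unique" | tr -d ' ')"
echo "distinct hosts: $unique_hosts"
```

On a real system, the sample file would be replaced by "$OAR_NODEFILE", and the deduplicated copy could be tried as the file that GASNET_NODEFILE points to.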

Can anyone help me configure the runtime to execute my code on multiple locales?

Best regards,

Tiago Carneiro.

score 3 · Accepted Answer · answered Dec 03 '18 at 16:29

I'm assuming you're using the Chapel 1.18 release and Open MPI (let me know if that's not true). There was a bug in Chapel 1.18 and earlier where, when using Open MPI, all Chapel instances were packed onto a single node first. This has been fixed on master (https://github.com/chapel-lang/chapel/pull/11546), and the fix will be included in the 1.19 release.

You could try using git master, or you might be able to set MPIRUN_CMD="mpirun --bind-to none --map-by ppr:1:node -np %N %P %A" as a workaround.
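Combining the settings from the question with the workaround above, the environment for Chapel 1.18 might look like the following sketch. Every value is copied from the question or this answer and nothing here has been verified on another cluster. The nodefile path and the hello6 binary are the asker's. In the comments below, the asker reports that the variant ending in `-machinefile %H %C` was the one that worked for them.

```shell
# Sketch only: all values come from the question and the answer above.
export CHPL_TASKS=qthreads
export CHPL_COMM=gasnet
export CHPL_COMM_SUBSTRATE=mpi
export CHPL_LAUNCHER=gasnetrun_mpi

# Nodefile path as used in the question.
export GASNET_NODEFILE="$(pwd)/nodes"

# Workaround suggested in the answer for Chapel 1.18 with Open MPI:
# place one process per node and do not bind it to specific cores.
export MPIRUN_CMD='mpirun --bind-to none --map-by ppr:1:node -np %N %P %A'

# Variant the asker reports as working on their cluster (see comments):
# export MPIRUN_CMD='mpirun --bind-to none --map-by ppr:1:node -np %N -machinefile %H %C'

# Then launch as in the question, for example:
# ./hello6 -nl 4
```

According to the answer, this workaround should no longer be needed once the fix from master is in the release being used.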

Elliot

Hello Elliot, thank you. I'm now able to run my multilocale experiments using MPI. However, I had to modify `MPIRUN_CMD` as follows: `MPIRUN_CMD='mpirun --bind-to none --map-by ppr:1:node -np %N -machinefile %H %C'`. One thing I would like to point out is that the official documentation gives no information on how to set up the portable MPI conduit. The only information given, `GASNET_LAUNCHER=mpirun`, results in compilation errors when building Chapel. Moreover, the need to export `MPIRUN_CMD` and `GASNET_NODEFILE` is not mentioned. Thank you again! – Tiago Carneiro Dec 04 '18 at 10:46
Good point -- I've opened https://github.com/chapel-lang/chapel/issues/11771 for improving the documentation. Note that in the next release you shouldn't have to set `MPIRUN_CMD`; that was a bug in the 1.18 release that has already been fixed upstream. – Elliot Dec 04 '18 at 15:37
