I have a C++ program that uses MS-MPI through Boost.MPI. I usually run it on a Windows HPC Pack cluster (12 nodes, each with 32 cores). It runs without problems on one, two, or four nodes. But when I run it on all 12 nodes, it runs for some time and then fails (every time; it has never succeeded). The error message in the output looks like this:
job aborted:
[ranks] message
[0] process exited without calling finalize
[1-383] terminated
---- error analysis -----
[0] on XXXXX
Model.exe ended prematurely and may have crashed. exit code 0xc0000409
---- error analysis -----
The program's own output from that failure is not readable. It comes out one character per line, something like this:
A
A
s
A
s
s
e
s
r
s
t
s
A
e
i
e
A
r
o
r
A
t
n
t
s
A
i
A
A
A
i
f
o
s
o
A
s
A
n
s
A
A
A
a
s
s
n
s
A
s
A
s
s
s
i
A
A
s
A
A
s
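My guess is that the characters above come from many ranks writing to the same stream at the same time. One thing I could try to make the message readable is to send each rank's stderr to its own file. This is only a minimal sketch: the helper names `rank_log_name` and `redirect_stderr_to_rank_file` and the `rank_<N>.log` file pattern are hypothetical, not something from my code or from MS-MPI.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: builds a log file name that is unique per rank,
// e.g. rank 5 -> "rank_5.log".
std::string rank_log_name(int rank) {
    return "rank_" + std::to_string(rank) + ".log";
}

// Hypothetical helper: reopens stderr onto the per-rank file so that
// messages from different ranks no longer interleave in one stream.
// Returns false if the file could not be opened.
bool redirect_stderr_to_rank_file(int rank) {
    if (std::freopen(rank_log_name(rank).c_str(), "w", stderr) == nullptr) {
        return false;
    }
    // Unbuffered, so the last message is not lost if the process crashes.
    std::setvbuf(stderr, nullptr, _IONBF, 0);
    return true;
}
```

In the real program this would be called once, right after the Boost.MPI environment is set up, with the rank taken from `boost::mpi::communicator::rank()`. The files would need to go to a path that every node can write to.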
Any suggestions on how to debug this would be great. Thanks.