I'm trying to run some Sum Product Network code found at http://alchemy.cs.washington.edu/spn/

When I try to run this on my Mac (OS X 10.8.4), I get the following error:

mpjrun.sh -np 1 eval.Run -d O
MPJ Express (0.40) is started in the multicore configuration
[Rank=0] *** Parameters ***
[Rank=0]    domain=O
[Rank=0]    numSumPerRegion=20
[Rank=0]    numComponentsPerVar=4
[Rank=0]    sparsePrior=1.0
[Rank=0]    baseResolution=4
[Rank=0]    numSlavePerClass=50
[Rank=0]    numSlaveGrp=1
[Rank=0] <TIME> init 1687 ms

mpjdev.MPJDevException: In Comm.irecv(), requested source 1 does not exist in communicator of size 1
        at mpjdev.Comm.recv(Comm.java:864)
        at mpi.Comm.recv(Comm.java:1294)
        at mpi.Comm.Recv(Comm.java:1255)
        at spn.SPN.recvUpdate(SPN.java:650)
        at spn.GenerativeLearning.learnHardEM(GenerativeLearning.java:52)
        at spn.GenerativeLearning.learn(GenerativeLearning.java:16)
        at eval.Run.runOlivetti(Run.java:147)
        at eval.Run.proc(Run.java:46)
        at eval.Run.main(Run.java:40)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at runtime.starter.MulticoreStarter$1.run(MulticoreStarter.java:277)
        at java.lang.Thread.run(Thread.java:744)
    java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at runtime.starter.MulticoreStarter$1.run(MulticoreStarter.java:277)
        at java.lang.Thread.run(Thread.java:744)
    Caused by: mpi.MPIException: mpi.MPIException: mpjdev.MPJDevException: In Comm.irecv(), requested source 1 does not exist in communicator of size 1
        at mpi.Comm.Recv(Comm.java:1259)
        at spn.SPN.recvUpdate(SPN.java:650)
        at spn.GenerativeLearning.learnHardEM(GenerativeLearning.java:52)
        at spn.GenerativeLearning.learn(GenerativeLearning.java:16)
        at eval.Run.runOlivetti(Run.java:147)
        at eval.Run.proc(Run.java:46)
        at eval.Run.main(Run.java:40)
        ... 6 more
    Caused by: mpi.MPIException: mpjdev.MPJDevException: In Comm.irecv(), requested source 1 does not exist in communicator of size 1
        at mpi.Comm.recv(Comm.java:1317)
        at mpi.Comm.Recv(Comm.java:1255)
        ... 12 more
    Caused by: mpjdev.MPJDevException: In Comm.irecv(), requested source 1 does not exist in communicator of size 1
        at mpjdev.Comm.recv(Comm.java:864)
        at mpi.Comm.recv(Comm.java:1294)
        ... 13 more

This happens for every value of -np I have tried. I assume the problem is not in the SPN code but in how I am using MPJ Express. I have tried MPJ Express versions 0.40 and 0.37 and get the same result with both.

Thanks for your time.

  • Have you checked that your MPJ Express installation passes the test suite? – Bibrak Apr 04 '14 at 12:01
  • That error seems to be thrown by the following line in **src/mpjdev/comm.java** `else if (src >= this.size() && src != -2) { throw new MPJDevException("In Comm.iprobe(), requested source " + src + " does not exist in communicator of size "+this.size()); }` When np = 1, a source rank of 1 cannot exist. – Bibrak Apr 04 '14 at 12:05

1 Answer

I had the same problem when running the code and found the answer in the SPN user guide. The command to run SPN is:

mpjrun.sh -np [NUM_PROCESSOR] -dev niodev -mx8000m eval.Run [SPN OPTIONS] > [LOG FILE]

where NUM_PROCESSOR depends on the number of slave processes per image category and the number of slave groups. It should equal (numSlavePerCat + 1) × numSlaveGroup; both numSlavePerCat and numSlaveGroup are set in common/Parameter.java. If your machine does not have that many processors, you can reduce numSlavePerCat.
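
As a rough sketch of that calculation, the shell snippet below derives the -np value from the two parameters and prints the resulting command. The values 50 and 1 are assumptions taken from the numSlavePerClass=50 and numSlaveGrp=1 lines in the question's log, which appear to be the same settings under slightly different names. The variable names are mine, and the -d O option is copied from the question.

```shell
#!/bin/sh
# Assumed values: copy the real ones from common/Parameter.java.
num_slave_per_cat=50   # assumed to correspond to numSlavePerClass=50 in the log
num_slave_group=1      # assumed to correspond to numSlaveGrp=1 in the log

# Formula from the user guide: (numSlavePerCat + 1) x numSlaveGroup
np=$(( (num_slave_per_cat + 1) * num_slave_group ))

# Build and print the command instead of running it, so it can be checked first.
cmd="mpjrun.sh -np $np -dev niodev -mx8000m eval.Run -d O"
echo "$cmd"
```

With these assumed values the formula gives 51 processes, which would explain why -np 1 fails: the code tries to receive from rank 1, but a communicator of size 1 only contains rank 0.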