I have developed an MPI application using Java and MPJ Express. It works perfectly in the multi-core configuration.
Recently, I was given access to a distributed-memory environment to test my application. First, I ran the MPJ HelloWorld application to check that the cluster configuration was working. It was. I then ran my own application, but it freezes after printing:
MPJ Express (0.38) is started in the cluster configuration
To make things worse, after I killed the process with Ctrl+C, I could no longer run the HelloWorld application either. I had to kill the MPJ daemons on all machines and start them again.
I even replaced the content of my main class with the content of the HelloWorld class to see whether it would print anything. It didn't. I also created a HelloWorld application with a package structure similar to my application's, and that one worked fine.
One big difference between the HelloWorld and my application is that mine depends on a set of libraries whose total size is around 29.8 MB, so I tried leaving the libraries out of the execution class path. With that change the application started, but of course it is useless that way because it can't find its required libraries at run time.
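To help narrow this down, here is a minimal sketch of a check that could be run as a plain Java program on each machine, using the same class path as the real application. It only lists each class path entry, whether it exists on that machine, and its size in bytes. The class name ClassPathCheck and the method describeClassPath are hypothetical names made up for this sketch; it uses only the standard JDK and no MPJ Express calls. The idea that a library might be missing or unreachable on some node is an assumption, not something I have confirmed.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic class: not part of MPJ Express or of my application.
public class ClassPathCheck {

    // Returns one line per class path entry: the path, whether it exists
    // on this machine, and its size in bytes (0 for directories or missing files).
    static List<String> describeClassPath(String classPath) {
        List<String> report = new ArrayList<>();
        for (String entry : classPath.split(File.pathSeparator)) {
            if (entry.isEmpty()) {
                continue; // skip empty segments such as a trailing separator
            }
            File f = new File(entry);
            long bytes = f.isFile() ? f.length() : 0;
            report.add(entry + " exists=" + f.exists() + " bytes=" + bytes);
        }
        return report;
    }

    public static void main(String[] args) {
        // java.class.path holds the class path the JVM was started with.
        for (String line : describeClassPath(System.getProperty("java.class.path"))) {
            System.out.println(line);
        }
    }
}
```

If an entry shows exists=false on one of the machines but not on the others, that would at least point to a difference in how the libraries are visible across the cluster.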
I would appreciate any comments or advice.
Thanks!