I currently have my model running with a discrete grid projection on 4 processes. I create the grid in the following way:

    std::vector<int> processDimensions;
    processDimensions.push_back(2);
    processDimensions.push_back(2);

    // The grid projection will contain agents of type VirusCellInteractionAgents, so that it can facilitate all agent types.
    // Then we can use the agent type identifier in each agent ID to cast them to the correct type of agent.
    discreteGridSpace = new repast::SharedDiscreteSpace<VirusCellInteractionAgents, repast::WrapAroundBorders, repast::SimpleAdder<VirusCellInteractionAgents>>("AgentsDeiscreteSpace", gridDimensions, processDimensions, 2, comm);

I want to run the model on 8 or 16 processes, so I was wondering what processDimensions should be in that case. I tried keeping it at 2 on each axis, as it was initially, but that results in the following error just after the first grid balance() call:

    ===================================================================================
    =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
    =   PID 71163 RUNNING AT Aleksandars-MBP
    =   EXIT CODE: 11
    =   CLEANING UP REMAINING PROCESSES
    =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
    ===================================================================================
    YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault: 11 (signal 11)

1 Answer

The dimensions should multiply to the number of processes. So, 4x2 for 8 or 4x4 for 16.
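As a minimal sketch of that rule, the helper below derives a two-axis layout from the number of processes instead of hard-coding it. Both function names are hypothetical and are not part of Repast HPC; the sketch assumes a 2D grid and simply picks the most square factor pair.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical helper (not a Repast HPC function): split worldSize ranks into
// an {x, y} layout whose product equals worldSize, as close to square as possible.
std::vector<int> makeProcessDimensions(int worldSize) {
    if (worldSize < 1) {
        throw std::invalid_argument("worldSize must be positive");
    }
    int y = 1;
    for (int d = 1; d * d <= worldSize; ++d) {
        if (worldSize % d == 0) {
            y = d;  // largest divisor not exceeding sqrt(worldSize)
        }
    }
    return {worldSize / y, y};
}

// Hypothetical check: true when the dimensions multiply to the number of ranks.
bool dimensionsMatchWorldSize(const std::vector<int>& dims, int worldSize) {
    if (dims.empty()) {
        return false;
    }
    long long product = 1;
    for (int d : dims) {
        product *= d;
    }
    return product == worldSize;
}
```

In the model, the result of `makeProcessDimensions(comm->size())` could be passed as processDimensions, assuming `comm` is a pointer to a `boost::mpi::communicator` as in the question's constructor call. This only addresses the layout rule; it does not by itself explain a crash that persists with a matching layout.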

Nick Collier
  • Hi, yes, I also tried 4x4 on 16 processes with a grid of size 100x100, so 25x25 for each process. However, I still get the following error when I call grid->balance(): BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES; PID 14164 RUNNING AT Aleksandars-MBP; EXIT CODE: 11; CLEANING UP REMAINING PROCESSES; YOU CAN IGNORE THE BELOW CLEANUP MESSAGES. – Alexandar Ruskov Mar 04 '22 at 12:53
  • I kept trying different combinations of process dimensions: 2x1 on 2 processes, 4x2 on 8, and 4x4 on 16. However, I keep getting the exact same error after the balance() call. The only combination that seems to work is 2x2 on 4 processes. Is there any other setting, apart from the process dimensions, that I need to change to make it work? – Alexandar Ruskov Mar 09 '22 at 11:22
  • This should work. Is there anything else in the code where you make an assumption about the number of ranks? Maybe print out the number of ranks just to make sure that it is what you think? You can also try to debug this with gdb. See https://www.open-mpi.org/faq/?category=debugging#serial-debuggers. Make sure to compile with debug flags first though. – Nick Collier Mar 09 '22 at 17:03
  • Yeah, I print out all of the bounds of the grid and I get 16 prints. It is at the balance() call that it runs into the error. One more interesting thing: I tried running the Repast HPC tutorial on 16 processes in exactly the same way. I changed the process dimensions to 4 and 4, ran on 16 processes, and even the provided tutorial failed in the exact same way! – Alexandar Ruskov Mar 10 '22 at 09:00
  • Sorry for the multiple comments, but I keep running out of characters. To reproduce the issue with the tutorial code (more specifically Tutorial_03_Step_03): 1) Edit Demo_03_Model.cpp by changing the process dimensions vector to contain any other combination (e.g. 2 and 1). 2) Compile the edited Demo_03 code and run with 2 processes. 3) The error triggers at the balance() call. – Alexandar Ruskov Mar 10 '22 at 10:07
  • I'm having some trouble reproducing this. Let's move the conversation to a GitHub issue here: https://github.com/Repast/repast.hpc/issues/8, as it doesn't seem appropriate for Stack Overflow anymore. – Nick Collier Mar 11 '22 at 20:01