I have a C++ code that can run in parallel, but only with shared-memory methods. I linked the code to PETSc, which can run it in parallel using the distributed-memory model. When I run the code (C++ linked with PETSc) in parallel, it seems that all the processors are repeating the same job. For instance, when the number of processors is 4, the boundary conditions and initial conditions are read 4 times, and if I use a printf command, the message is printed 4 times. So the work is not being distributed between the processors: every processor does the whole job instead of only part of it. Has anyone had the same experience, and what would you suggest to solve this problem? For example, below you can see that the code reads the mesh twice instead of reading it once:
reading mesh file Mesh_cavity2d.txt:
reading mesh file Mesh_cavity2d.txt:
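One possible reading of this output: under a distributed-memory launch, every process normally executes the whole program from the start, so work is only divided where the code branches on the process rank. The sketch below illustrates that idea in plain C++ without calling MPI or PETSc. The names `Range`, `ownedRange` and `isRoot` are hypothetical helpers and not part of either library; in a real run, `rank` and `size` would presumably come from `MPI_Comm_rank` and `MPI_Comm_size` on `PETSC_COMM_WORLD`.

```cpp
#include <cassert>
#include <cstddef>

// Half-open index range [begin, end) owned by one process.
struct Range {
    std::size_t begin;
    std::size_t end;
};

// Hypothetical helper (not a PETSc or MPI function): split n items into
// `size` contiguous blocks and return the block that belongs to `rank`.
// The first n % size ranks receive one extra item each.
Range ownedRange(std::size_t n, int rank, int size) {
    assert(size > 0 && rank >= 0 && rank < size);
    const std::size_t r = static_cast<std::size_t>(rank);
    const std::size_t p = static_cast<std::size_t>(size);
    const std::size_t base = n / p;   // items every rank gets
    const std::size_t extra = n % p;  // leftover items for the low ranks
    const std::size_t begin = r * base + (r < extra ? r : extra);
    const std::size_t end = begin + base + (r < extra ? 1 : 0);
    return {begin, end};
}

// Hypothetical helper: true only on the process that should perform
// one-time actions such as printing a status message.
bool isRoot(int rank) { return rank == 0; }
```

With something like this, each process would loop only over `ownedRange(n, rank, size)` instead of over all `n` items, and a call such as `printf` would sit inside `if (isRoot(rank))` so that the message appears once. Whether the mesh file itself should be read by one process or by all of them depends on how the mesh is stored afterwards, which this sketch does not address.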
or: