Which mode is better when optimizing matrix-vector multiplication with MPI (size > 6000)?

Question

Normally, when multiplying a matrix a[size][size] by a vector b[size], with nump = the number of processors, which of the approaches below is better?

Case 1: Divide matrix A into nump parts, each with size/nump rows, and let each process handle all the rows in one part.

Case 2: Use row-by-row distribution: whenever a process is idle, an unprocessed row of matrix A is sent to that process for computation.

I'm bothered by the cost of send/recv/broadcast. Is there a proportional relation between size and the communication cost? Is there any way to predict the time complexity? Or can I only measure it with tools like VTune?


score 0 · Answer 1 · answered Aug 09 '23 at 16:37

If you're using MPI, you should never create the matrix on one process and then distribute it to the others. That is a time and memory bottleneck. Create it distributed right from the get-go.
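A minimal sketch of what "distributed from the get-go" could look like, written in plain Python so it runs without an MPI installation. The helper names `block_range`, `build_local_rows`, and `local_matvec`, and the fill rule `a[i][j] = i + j`, are assumptions made for illustration only. In a real MPI program `rank` and `nump` would come from `MPI_Comm_rank` and `MPI_Comm_size` rather than a loop.

```python
def block_range(size, nump, rank):
    """Return (first_row, row_count) owned by `rank` under a static block split.

    The first `size % nump` ranks get one extra row, so counts differ by at most 1.
    """
    base, extra = divmod(size, nump)
    count = base + (1 if rank < extra else 0)
    first = rank * base + min(rank, extra)
    return first, count


def build_local_rows(size, nump, rank):
    """Create only the rows this rank owns; no rank ever holds the full matrix."""
    first, count = block_range(size, nump, rank)
    # Hypothetical fill rule a[i][j] = i + j, standing in for the real data source.
    return [[float(i + j) for j in range(size)] for i in range(first, first + count)]


def local_matvec(local_rows, b):
    """Multiply the locally owned rows by the full vector b."""
    return [sum(a_ij * b_j for a_ij, b_j in zip(row, b)) for row in local_rows]


if __name__ == "__main__":
    size, nump = 10, 4
    b = [1.0] * size
    # Loop over ranks to imitate what each process would do independently.
    for rank in range(nump):
        first, count = block_range(size, nump, rank)
        y_local = local_matvec(build_local_rows(size, nump, rank), b)
        print(rank, first, count, y_local)
```

Each rank still needs the whole vector b, but that is only size values, which is small next to its size*size/nump block of the matrix. In MPI the partial results could then be combined with a collective such as `MPI_Gatherv` or `MPI_Allgatherv`.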

About your dynamic scheme: since every row has the same length, the work is already evenly distributed, so you should prefer a static scheme over a dynamic one.
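To put rough numbers on the communication cost, here is a sketch based on the common latency-bandwidth model, in which one message carrying n values costs about alpha + beta * n. This model, the function names, and the values of `alpha` and `beta` are all assumptions for illustration, not measurements; real values have to be measured on the target machine. The sketch also ignores the cost of getting b to every process, which both schemes need.

```python
def message_cost(n_values, alpha, beta):
    """Latency-bandwidth estimate for one message carrying n_values numbers."""
    return alpha + beta * n_values


def dynamic_comm_cost(size, alpha, beta):
    """Master hands out one row per request and gets one result value back."""
    per_row = message_cost(size, alpha, beta) + message_cost(1, alpha, beta)
    return size * per_row  # every row passes through the single master


def static_comm_cost(size, nump, alpha, beta):
    """Matrix already distributed: only the result blocks travel to one rank.

    Simplification: the nump - 1 result messages are counted one after another.
    """
    rows_per_rank = -(-size // nump)  # ceiling division
    return (nump - 1) * message_cost(rows_per_rank, alpha, beta)


if __name__ == "__main__":
    # Placeholder values, NOT measurements: 1 microsecond latency,
    # 1 nanosecond per transferred value.
    alpha, beta = 1e-6, 1e-9
    size, nump = 6000, 8
    print("dynamic:", dynamic_comm_cost(size, alpha, beta))
    print("static: ", static_comm_cost(size, nump, alpha, beta))
```

Under this model the dynamic scheme moves about size*size matrix values for about size*size multiply-adds, roughly one transferred value per multiply-add, so its communication grows quadratically with size. The static scheme with a pre-distributed matrix only moves about size result values.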
