I am experimenting with matrices in Python and want to use multiprocessing to process each row separately for a math operation. I have posted a minimal reproducible sample below, but keep in mind that in my actual code I do in fact need the entire matrix passed to the helper function.

This sample takes an extremely long time to process a 10,000 by 10,000 matrix: almost 2 hours with 9 processes. Looking in Task Manager, only 4-5 of the processes seem to run at any given time, and the application never uses more than 25% of my CPU. I've done my best to avoid branches in my real code, and the sample provided is branchless. It still takes roughly 25 seconds to process a 1000 by 1000 matrix on my machine, which is ludicrous to me as a mainly C++ developer. I wrote serial code in C that processes the entire 10,000 by 10,000 matrix in less than a second.

I think the main bottleneck is the multiprocessing code, but I am required to do this with multiprocessing. Any ideas for how I could improve this? Each row can be processed entirely separately, but the results need to be joined back together into a matrix in my actual code.

import random
from multiprocessing import Pool
import time


def addMatrixRow(matrixData):
    matrix = matrixData[0]
    rowNum = matrixData[1]
    del (matrixData)
    rowSum = 0
    for colNum in range(len(matrix[rowNum])):
        rowSum += matrix[rowNum][colNum]
    return rowSum


def genMatrix(row, col):
    matrix = list()
    for i in range(row):
        matrix.append(list())
        for j in range(col):
            matrix[i].append(random.randint(0, 1))
    return matrix


def main():
    matrix = genMatrix(1000, 1000)
    print("generated matrix")
    MAX_PROCESSES = 4
    finalSum = 0
    processPool = Pool(processes=MAX_PROCESSES)
    poolData = list()
    start = time.time()
    for i in range(100):
        for rowNum in range(len(matrix)):
            matrixData = [matrix, rowNum]
            poolData.append(matrixData)
        finalData = processPool.map(addMatrixRow, poolData)
        poolData = list()
        finalSum += sum(finalData)
    end = time.time()
    print(end-start)
    print(f'final sum {finalSum}')


if __name__ == '__main__':
    main()
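
For comparison, here is a minimal sketch of one possible restructuring. In the sample above, every task is a `[matrix, rowNum]` pair, so the whole matrix travels to the workers with the task data on every `map` call. The sketch assumes the helper only reads the matrix: each worker receives it once through the `initializer`/`initargs` parameters of `Pool`, and the tasks are then plain row indices. The names `init_worker`, `_worker_matrix`, `add_matrix_row`, `gen_matrix`, and `parallel_sum`, the `chunksize` of 64, and the 200 by 200 demo size are all hypothetical choices for illustration, and I have not measured how much this helps.

```python
import random
from multiprocessing import Pool

# Hypothetical module-level slot; each worker process fills in its own copy.
_worker_matrix = None


def init_worker(matrix):
    """Runs once in each worker: keep the matrix so tasks need not resend it."""
    global _worker_matrix
    _worker_matrix = matrix


def add_matrix_row(row_num):
    """Sum one row of the matrix that this worker already holds."""
    return sum(_worker_matrix[row_num])


def gen_matrix(rows, cols):
    """Build a rows x cols matrix of random 0/1 values."""
    return [[random.randint(0, 1) for _ in range(cols)] for _ in range(rows)]


def parallel_sum(matrix, processes=4, chunksize=64):
    """Sum all entries, one row per task (assumed read-only access)."""
    # The matrix is handed to each worker once via initializer/initargs;
    # each task is then just an integer row index.
    with Pool(processes=processes,
              initializer=init_worker,
              initargs=(matrix,)) as pool:
        row_sums = pool.map(add_matrix_row, range(len(matrix)),
                            chunksize=chunksize)
    return sum(row_sums)


if __name__ == '__main__':
    matrix = gen_matrix(200, 200)
    print(parallel_sum(matrix))
```

The helper still has access to the entire matrix, which the real code needs, and `pool.map` returns the per-row results in row order, so they can be joined back into a matrix afterwards. If the real helper returns a full row instead of a single number, the results still have to be sent back to the parent process, so that cost would remain.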