I am trying to implement Strassen matrix multiplication algorithm in both sequential and parallel. I want the following code to run parallel but I have no experience with parallel programming.
int[][] M1 = multiply(add(A11, A22), add(B11, B22));
int[][] M2 = multiply(add(A21, A22), B11);
int[][] M3 = multiply(A11, sub(B12, B22));
int[][] M4 = multiply(A22, sub(B21, B11));
int[][] M5 = multiply(add(A11, A12), B22);
int[][] M6 = multiply(sub(A21, A11), add(B11, B12));
int[][] M7 = multiply(sub(A12, A22), add(B21, B22));
I searched through the web and found a possible solution and arranged the code again:
ExecutorService executor = Executors.newCachedThreadPool();
List<FutureTask<int[][]>> taskList1 = new ArrayList<FutureTask<int[][]>>();
// Start thread for the first half of the numbers
FutureTask<int[][]> futureTask_2 = new FutureTask<int[][]>(new Callable<int[][]>() {
@Override
public int[][] call() throws InterruptedException, ExecutionException {
return multiply(add(A11, A22), add(B11, B22));
}
});
FutureTask<int[][]> futureTask_3 = new FutureTask<int[][]>(new Callable<int[][]>() {
@Override
public int[][] call() throws InterruptedException, ExecutionException {
return multiply(add(A21, A22), B11);
}
});
FutureTask<int[][]> futureTask_4 = new FutureTask<int[][]>(new Callable<int[][]>() {
@Override
public int[][] call() throws InterruptedException, ExecutionException {
return multiply(A11, sub(B12, B22));
}
});
FutureTask<int[][]> futureTask_5 = new FutureTask<int[][]>(new Callable<int[][]>() {
@Override
public int[][] call() throws InterruptedException, ExecutionException {
return multiply(A22, sub(B21, B11));
}
});
FutureTask<int[][]> futureTask_6 = new FutureTask<int[][]>(new Callable<int[][]>() {
@Override
public int[][] call() throws InterruptedException, ExecutionException {
return multiply(add(A11, A12), B22);
}
});
FutureTask<int[][]> futureTask_7 = new FutureTask<int[][]>(new Callable<int[][]>() {
@Override
public int[][] call() throws InterruptedException, ExecutionException {
return multiply(sub(A21, A11), add(B11, B12));
}
});
FutureTask<int[][]> futureTask_8 = new FutureTask<int[][]>(new Callable<int[][]>() {
@Override
public int[][] call() throws InterruptedException, ExecutionException {
return multiply(sub(A12, A22), add(B21, B22));
}
});
taskList1.add(futureTask_2);
taskList1.add(futureTask_3);
taskList1.add(futureTask_4);
taskList1.add(futureTask_5);
taskList1.add(futureTask_6);
taskList1.add(futureTask_7);
taskList1.add(futureTask_8);
executor.execute(futureTask_2);
executor.execute(futureTask_3);
executor.execute(futureTask_4);
executor.execute(futureTask_5);
executor.execute(futureTask_6);
executor.execute(futureTask_7);
executor.execute(futureTask_8);
FutureTask<int[][]> ftrTask = taskList1.get(0);
final int[][] M1 = ftrTask.get();
FutureTask<int[][]> ftrTask1 = taskList1.get(1);
final int[][] M2 = ftrTask1.get();
FutureTask<int[][]> ftrTask2 = taskList1.get(2);
final int[][] M3 = ftrTask2.get();
FutureTask<int[][]> ftrTask3 = taskList1.get(3);
final int[][] M4 = ftrTask3.get();
FutureTask<int[][]> ftrTask4 = taskList1.get(4);
final int[][] M5 = ftrTask4.get();
FutureTask<int[][]> ftrTask5 = taskList1.get(5);
final int[][] M6 = ftrTask5.get();
FutureTask<int[][]> ftrTask6 = taskList1.get(6);
final int[][] M7 = ftrTask6.get();
executor.shutdown();
When I run the program for small number of array dimensions like 2,4,8,16 it works almost the same amount of time as in sequential version. For large dimensions like 100, 1000, it calculates the result in much longer time then the sequential version. Is my parallel implementation wrong?