The problem is that it is very slow, terribly slow, even for a small n such as n=1024. There must be something wrong; can anyone help?

I don't create a new matrix C on each function call. When the base case is reached, I add the new product to the previous result, which is stored in the original matrix C.

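    #include <stdlib.h>

    int **matA, **matB, **matC;

    /* Adds the product of the n x n block of A at (Arow, Acol) and the
       n x n block of B at (Brow, Bcol) to the block of C at (Arow, Bcol). */
    void matmul_div_rec(int Arow, int Acol, int Brow, int Bcol, int n)
    {
        if (n == 1) {
            matC[Arow][Bcol] += matA[Arow][Acol] * matB[Brow][Bcol];
        } else {
            int h = n / 2;
            matmul_div_rec(Arow,     Acol,     Brow,     Bcol,     h);
            matmul_div_rec(Arow,     Acol + h, Brow + h, Bcol,     h);
            matmul_div_rec(Arow,     Acol,     Brow,     Bcol + h, h);
            matmul_div_rec(Arow,     Acol + h, Brow + h, Bcol + h, h);
            matmul_div_rec(Arow + h, Acol,     Brow,     Bcol,     h);
            matmul_div_rec(Arow + h, Acol + h, Brow + h, Bcol,     h);
            matmul_div_rec(Arow + h, Acol,     Brow,     Bcol + h, h);
            matmul_div_rec(Arow + h, Acol + h, Brow + h, Bcol + h, h);
        }
    }

    /* Assumed allocation (not shown in the original post): a zero-filled
       n x n matrix, error handling omitted. */
    static int **alloc_matrix(int n)
    {
        int **m = malloc(n * sizeof *m);
        for (int i = 0; i < n; i++)
            m[i] = calloc(n, sizeof **m);
        return m;
    }

    int main(void)
    {
        int n = 1024; /* n must be a power of 2 */
        matA = alloc_matrix(n);
        matB = alloc_matrix(n);
        matC = alloc_matrix(n);
        matmul_div_rec(0, 0, 0, 0, n);
        return 0;
    }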
  • Given the size of your matrices, I am guessing they are sparse matrices (mostly zeros). You can save a lot of space and time if you treat them as such. – doron Jul 01 '17 at 07:58
  • Read about Strassen's method for matrix multiplication: https://stackoverflow.com/questions/4846938/divide-and-conquer-matrix-multiplication http://www.geeksforgeeks.org/strassens-matrix-multiplication/ – EsmaeelE Jul 01 '17 at 09:45
  • See https://stackoverflow.com/questions/12922031/recursive-matrix-multiplication – Rafael Coelho Jul 01 '17 at 09:48
  • For n=1024, this performs 2^30 ≈ 1B multiply-adds and makes 1227133512 ≈ 1.2B recursive function calls (each of which also requires a few more adds and the calculation of n/2). A naive implementation of matrix multiply performs the same number of multiply-adds, but no recursive function calls. – Paul Hankin Jul 01 '17 at 10:28
  • Paolo D'Alberto maintains a website, [FastMMW](http://www.fastmmw.com), devoted to practical matrix multiplication algorithms. – Axel Kemper Jul 03 '17 at 16:04
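
The call count quoted in the comments can be checked with a few lines of C. This is only a sketch: `count_calls` and its `cutoff` parameter are hypothetical names introduced here, not part of the question's code. It counts every invocation of an 8-way recursion, including the top-level call, and assumes n is a power of two.

```c
#include <stdint.h>

/* Number of invocations of an 8-way divide-and-conquer recursion on an
   n x n problem that stops recursing once n <= cutoff.
   The count includes the top-level call. Assumes n is a power of two. */
uint64_t count_calls(uint64_t n, uint64_t cutoff)
{
    if (n <= cutoff)
        return 1;
    return 1 + 8 * count_calls(n / 2, cutoff);
}
```

With `cutoff = 1` and `n = 1024` this gives 1227133513, that is, the 1227133512 recursive calls mentioned above plus the initial call. With a cutoff of 32 (an arbitrary illustrative value) the total drops to 37449, which suggests how much of the overhead comes from recursing all the way down to single elements.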

1 Answer

As briefly discussed in, for instance, this paper, an implementation of Strassen's algorithm (even if it is correct) is competitive only if some additional ideas are used. These include, for instance, storing the matrices in memory in the so-called Morton order layout, which makes the submatrices easier to address and improves caching behaviour through better memory locality. Furthermore, there is a potential improvement from using SIMD to parallelize the matrix addition at the base case of the recursion.
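
To illustrate the Morton order idea, here is a minimal sketch of the index computation, assuming row and column indices below 2^16. The names `spread_bits` and `morton_index` are hypothetical and not taken from the paper. The index is formed by interleaving the bits of the row and the column, so each aligned quadrant of the matrix occupies one contiguous range in memory.

```c
#include <stdint.h>

/* Spread the low 16 bits of x so that bit i moves to bit 2*i. */
static uint32_t spread_bits(uint32_t x)
{
    x &= 0x0000FFFFu;
    x = (x | (x << 8)) & 0x00FF00FFu;
    x = (x | (x << 4)) & 0x0F0F0F0Fu;
    x = (x | (x << 2)) & 0x33333333u;
    x = (x | (x << 1)) & 0x55555555u;
    return x;
}

/* Morton (Z-order) index of element (row, col): row bits land on the
   odd bit positions, column bits on the even ones. */
uint32_t morton_index(uint32_t row, uint32_t col)
{
    return (spread_bits(row) << 1) | spread_bits(col);
}
```

For a 4x4 matrix, the top-left 2x2 quadrant maps to indices 0 to 3, the top-right to 4 to 7, and so on, so a recursive call on a quadrant touches a single contiguous stretch of memory.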

Codor
  • The code in the question isn't Strassen. It's a naive matrix multiply algorithm written in a "divide and conquer" recursive style. – Paul Hankin Jul 01 '17 at 10:33
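
Since the code in the question is the naive algorithm in recursive form, one common way to reduce its call overhead is to stop recursing at a larger block size and finish with plain loops. The following is a sketch under stated assumptions, not the asker's code: `matmul_rec_cutoff` and `CUTOFF` are hypothetical names, the threshold of 32 is arbitrary and would need tuning by measurement, and the matrices are stored as flat row-major arrays with row stride `ld` instead of `int **`.

```c
#include <stddef.h>

#define CUTOFF 32 /* hypothetical threshold; tune by measurement */

/* Adds the product of the n x n block of A at (Arow, Acol) and the
   n x n block of B at (Brow, Bcol) to the block of C at (Arow, Bcol).
   A, B and C are row-major with row stride ld; n is a power of two. */
void matmul_rec_cutoff(const int *A, const int *B, int *C, size_t ld,
                       size_t Arow, size_t Acol, size_t Brow, size_t Bcol,
                       size_t n)
{
    if (n <= CUTOFF) {
        /* Base case: plain triple loop over the block (i, k, j order). */
        for (size_t i = 0; i < n; i++)
            for (size_t k = 0; k < n; k++) {
                int a = A[(Arow + i) * ld + Acol + k];
                for (size_t j = 0; j < n; j++)
                    C[(Arow + i) * ld + Bcol + j] +=
                        a * B[(Brow + k) * ld + Bcol + j];
            }
        return;
    }

    /* Recursive case: the same eight sub-products as in the question. */
    size_t h = n / 2;
    for (size_t i = 0; i <= h; i += h)
        for (size_t j = 0; j <= h; j += h)
            for (size_t k = 0; k <= h; k += h)
                matmul_rec_cutoff(A, B, C, ld,
                                  Arow + i, Acol + k, Brow + k, Bcol + j, h);
}
```

A call such as `matmul_rec_cutoff(A, B, C, n, 0, 0, 0, 0, n)` with a zero-initialised `C` plays the role of `matmul_div_rec(0,0,0,0,n)` in the question.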