Questions tagged [matrix-multiplication]

Questions related to matrix multiplication, especially implementation. Mathematical questions should consider the [linear-algebra] tag.

Matrix multiplication is the operation of multiplying two (or more) matrices. It arises frequently in linear algebra and is a fundamental task in computing, especially in scientific computing.

To that end, a number of fundamental libraries, such as LAPACK, BLAS, and ATLAS, have been developed. Because computation time grows quickly with matrix size (the standard algorithm takes on the order of n³ multiplications for two n×n matrices), extensive effort has gone into optimizing these packages for various computer architectures and matrix sizes.

In R, a language for statistical computing and graphics, the operator %*% performs matrix multiplication (see ?"%*%") and interfaces with the BLAS routine dgemm.


The product of two matrices a and b is the matrix c, where each element c[i][j] is the sum of the products of corresponding elements of the i-th row of a and the j-th column of b.

c[i][j] += a[i][k] * b[k][j];
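
The same rule can be written as a formula. This is a sketch that takes a to be m×n and b to be n×p; the indices here are 1-based, as is usual in mathematical notation, while the array indices in the code are 0-based.

```latex
c_{ij} = \sum_{k=1}^{n} a_{ik}\, b_{kj},
\qquad 1 \le i \le m, \quad 1 \le j \le p
```

For instance, with first row {1, 2} in a and first column {1, 5} in b, the top-left element is 1·1 + 2·5 = 11.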

Example:

     (b) { 1,  2,  3,  4}
         { 5,  6,  7,  8}
(a)    ┌(c)──────────────
{1, 2} │ {11, 14, 17, 20}
{3, 4} │ {23, 30, 37, 44}
{5, 6} │ {35, 46, 57, 68}

Algorithm of the matrix multiplication:

// rows of 'a' matrix
int m = 3;
// columns of 'a' matrix
// and rows of 'b' matrix
int n = 2;
// columns of 'b' matrix
int p = 4;
// matrices 'a=m×n', 'b=n×p'
int[][] a = {{1, 2}, {3, 4}, {5, 6}},
        b = {{1, 2, 3, 4}, {5, 6, 7, 8}};
// resulting matrix 'c=m×p'
int[][] c = new int[m][p];
// iterate over the rows of the 'a' matrix
for (int i = 0; i < m; i++) {
    // iterate over the columns of the 'b' matrix
    for (int j = 0; j < p; j++) {
        // iterate over the columns of the 'a'
        // matrix, aka rows of the 'b' matrix
        for (int k = 0; k < n; k++) {
            // sum of the products of
            // the i-th row of 'a' and
            // the j-th column of 'b'
            c[i][j] += a[i][k] * b[k][j];
        }
    }
}
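
The snippet above can be wrapped in a reusable method. The following is a minimal sketch, not a tuned implementation: the class name `MatrixMultiply` and the method name `multiply` are hypothetical, chosen only for illustration, and the dimension checks are an assumption about how mismatched inputs should be handled.

```java
import java.util.Arrays;

public class MatrixMultiply {
    // Hypothetical helper: multiplies an m×n matrix 'a' by an n×p matrix 'b'.
    static int[][] multiply(int[][] a, int[][] b) {
        if (a.length == 0 || b.length == 0) {
            throw new IllegalArgumentException("matrices must not be empty");
        }
        int m = a.length;    // rows of 'a'
        int n = b.length;    // rows of 'b', must equal columns of 'a'
        int p = b[0].length; // columns of 'b'
        for (int[] row : a) {
            if (row.length != n) {
                throw new IllegalArgumentException(
                        "columns of 'a' must equal rows of 'b'");
            }
        }
        for (int[] row : b) {
            if (row.length != p) {
                throw new IllegalArgumentException("'b' must be rectangular");
            }
        }
        // resulting matrix 'c=m×p', zero-initialized by Java
        int[][] c = new int[m][p];
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < p; j++) {
                for (int k = 0; k < n; k++) {
                    // accumulate the products of the i-th row of 'a'
                    // and the j-th column of 'b'
                    c[i][j] += a[i][k] * b[k][j];
                }
            }
        }
        return c;
    }

    public static void main(String[] args) {
        int[][] a = {{1, 2}, {3, 4}, {5, 6}};
        int[][] b = {{1, 2, 3, 4}, {5, 6, 7, 8}};
        // prints [[11, 14, 17, 20], [23, 30, 37, 44], [35, 46, 57, 68]]
        System.out.println(Arrays.deepToString(multiply(a, b)));
    }
}
```

The loop order i, j, k is kept the same as in the snippet above, so the result matches the worked example.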
2901 questions
22 votes, 4 answers

Laderman's 3x3 matrix multiplication with only 23 multiplications, is it worth it?

Take the product of two 3x3 matrices A*B=C. Naively this requires 27 multiplications using the standard algorithm. If one were clever, you could do this using only 23 multiplications, a result found in 1973 by Laderman. The technique involves saving…
Hooked
20 votes, 4 answers

Why does the order of loops in a matrix multiply algorithm affect performance?

I am given two functions for finding the product of two matrices: void MultiplyMatrices_1(int **a, int **b, int **c, int n){ for (int i = 0; i < n; i++) for (int j = 0; j < n; j++) for (int k = 0; k < n; k++) …
kevlar1818
20 votes, 3 answers

OpenMP C++ Matrix Multiplication run slower in parallel

I'm learning the basics of parallel execution of a for loop using OpenMP. Sadly, my parallel program runs 10x slower than the serial version. What am I doing wrong? Am I missing some barriers? double **basicMultiply(double **A, double **B, int size) { …
Hynek Blaha
19 votes, 14 answers

Multiply 2 matrices in Javascript

I'm doing a function that multiplies 2 matrices. The matrices will always have the same number of rows and columns. (2x2, 5x5, 23x23, ...) When I print it, it doesn't work. Why? For example, if I create two 2x2…
Jordi 45454
19 votes, 1 answer

loop tiling/blocking for large dense matrix multiplication

I was wondering if someone could show me how to use loop tiling/loop blocking for large dense matrix multiplication effectively. I am doing C = AB with 1000x1000 matrices. I have followed the example on Wikipedia for loop tiling but I get worse…
user2088790
18 votes, 2 answers

Is there a Java library for better linear regression? (E.g., iteratively reweighted least squares)

I am struggling to find a way to perform better linear regression. I have been using the Moore-Penrose pseudoinverse and QR decomposition with JAMA library, but the results are not satisfactory. Would ojAlgo be useful? I have been hitting…
18 votes, 1 answer

Make the matrix multiplication operator @ work for scalars in numpy

In python 3.5, the @ operator was introduced for matrix multiplication, following PEP465. This is implemented e.g. in numpy as the matmul operator. However, as proposed by the PEP, the numpy operator throws an exception when called with a scalar…
18 votes, 14 answers

How to efficiently store a matrix with highly-redundant values

I have a very large matrix (100M rows by 100M columns) that has lots of duplicate values right next to each other. For example: 8 8 8 8 8 8 8 8 8 8 8 8 8 8 4 8 8 1 1 1 1 1 8 8 8 8 8 4 8 8 1 1 1 1 1 8 8 8 8 8 4 8 8 1 1 1 1 1 8 8 8 8 8 4 8 8 1 1 1…
17 votes, 3 answers

Why can GPU do matrix multiplication faster than CPU?

I've been using GPU for a while without questioning it but now I'm curious. Why can GPU do matrix multiplication much faster than CPU? Is it because of parallel processing? But I didn't write any parallel processing code. Does it do it…
aerin
17 votes, 1 answer

How to get faster code than numpy.dot for matrix multiplication?

Here Matrix multiplication using hdf5 I use hdf5 (pytables) for big matrix multiplication, but I was surprised because using hdf5 it works even faster than using plain numpy.dot and storing matrices in RAM. What is the reason for this behavior? And…
mrgloom
17 votes, 5 answers

Speeding up element-wise array multiplication in python

I have been playing around with numba and numexpr trying to speed up a simple element-wise matrix multiplication. I have not been able to get better results; they are both basically (speedwise) equivalent to NumPy's multiply function. Has anyone had…
JEquihua
17 votes, 1 answer

Efficient 4x4 matrix vector multiplication with SSE: horizontal add and dot product - what's the point?

I am trying to find the most efficient implementation of 4x4 matrix (M) multiplication with a vector (u) using SSE. I mean Mu = v. As far as I understand there are two primary ways to go about this: method 1) v1 = dot(row1, u), v2 = dot(row2,…
user2088790
16 votes, 2 answers

Simple CUBLAS Matrix Multiplication Example?

I'm looking for a very bare bones matrix multiplication example for CUBLAS that can multiply M times N and place the results in P for the following code, using high-performance GPU operations: float M[500][500], N[500][500], P[500][500]; for(int i =…
Chris Redford
15 votes, 4 answers

Type-safe matrix multiplication

After the long-winded discussion at Write this Scala Matrix multiplication in Haskell, I was left wondering...what would a type-safe matrix multiplication look like? So here's your challenge: either link to a Haskell implementation, or implement…
Dan Burton
15 votes, 4 answers

Why is matrix multiplication in .NET so slow?

I don't quite understand what makes matrix multiplication in C#/.NET (and even Java) so slow. Take a look at this benchmark (source): Trying to find an updated benchmark. C#'s integer and double performance is damn close to C++ compiled with…
Matthew Olenik