Why does BLAS have a gemm function for matrix-matrix multiplication and a separate gemv function for matrix-vector multiplication? Isn't matrix-vector multiplication just a special case of matrix-matrix multiplication where one matrix has only one row/column?

- [dgemm](http://www.netlib.org/blas/dgemm.f) and [dgemv](http://www.netlib.org/blas/dgemv.f): Fortran 77 double-precision versions of the discussed functions, for the curious. Also just want to add that this is a really important (and often used) special case where special optimizations might be possible, even if that doesn't show in the F77 versions. – user786653 Aug 15 '11 at 16:48
- It's also interesting to compare the performance of gemm and gemv for vector-matrix multiplication. – constructor Dec 30 '15 at 10:20
3 Answers
Mathematically, matrix-vector multiplication is a special case of matrix-matrix multiplication, but that's not necessarily true of them as realized in a software library.
They support different options. For example, gemv supports strided access to the vectors on which it is operating, whereas gemm does not support strided matrix layouts. In the C language bindings, gemm requires that you specify the storage ordering of all three matrices, whereas that is unnecessary in gemv for the vector arguments because it would be meaningless.
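For instance, in the CBLAS interface the vector strides appear as explicit `incX`/`incY` arguments of `cblas_dgemv`, while `cblas_dgemm` only has the leading dimensions `lda`/`ldb`/`ldc`. A minimal sketch (the sizes and strided layout here are made up for illustration):

```c
#include <cblas.h>

int main(void)
{
    double A[9] = {1, 0, 0,  0, 1, 0,  0, 0, 1};  /* 3x3 identity, column-major */
    double x[5] = {1, 9, 2, 9, 3};                /* payload at every 2nd slot  */
    double y[5] = {0};

    /* gemv: incX = incY = 2 walks the strided vectors directly. */
    cblas_dgemv(CblasColMajor, CblasNoTrans, 3, 3,
                1.0, A, 3, x, 2, 0.0, y, 2);

    /* gemm: no vector strides; x must be handed over as a contiguous
     * 3x1 matrix, described only by its leading dimension. */
    double xc[3] = {1, 2, 3}, yc[3];
    cblas_dgemm(CblasColMajor, CblasNoTrans, CblasNoTrans,
                3, 1, 3, 1.0, A, 3, xc, 3, 0.0, yc, 3);

    return 0;
}
```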
Besides supporting different options, there are families of optimizations that can be performed on gemm that are not applicable to gemv. If you know you are doing a matrix-vector product, you don't want the library to waste time figuring that out before switching into the code path optimized for that case; you'd rather call it directly.

- gemm uses the `lda, ldb, ldc` arguments, which are the row/column strides, and with them you can express the same thing for a one-column matrix as the `inc` parameter does when passing a vector. So it ends up equivalent. – bluss Mar 16 '16 at 22:18
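That equivalence can indeed be spelled out, though only for positive strides (gemv additionally allows negative increments, which leading dimensions cannot express). A sketch, assuming column-major CBLAS: view the strided vector as a 1×n row matrix whose leading dimension plays the role of `inc`:

```c
#include <cblas.h>

/* y = A*x with strided x and y, two ways (A is n x n, column-major,
 * leading dimension n; incx and incy must be positive here). */
void strided_two_ways(int n, const double *A,
                      const double *x, int incx, double *y, int incy)
{
    /* Directly: gemv takes the vector strides explicitly. */
    cblas_dgemv(CblasColMajor, CblasNoTrans, n, n,
                1.0, A, n, x, incx, 0.0, y, incy);

    /* Via gemm: treat x as a 1 x n row matrix with leading dimension
     * incx and compute the 1 x n result y^T = x^T * A^T, whose leading
     * dimension is incy. Element (0,k) of x lives at x[k*incx]. */
    cblas_dgemm(CblasColMajor, CblasNoTrans, CblasTrans,
                1, n, n, 1.0, x, incx, A, n, 0.0, y, incy);
}
```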
When you optimize gemv and gemm, different techniques apply:
- For the matrix-matrix operation you use blocked algorithms, with block sizes chosen to match the cache sizes.
- For the matrix-vector product you use so-called fused Level 1 operations (e.g. fused dot products or fused axpy operations).
Both techniques are sketched below. Let me know if you want more details.
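A minimal sketch of both ideas in plain C (toy code rather than a tuned kernel; column-major storage, square matrices, and an arbitrary block size `NB` are assumed):

```c
#include <stddef.h>

enum { NB = 64 };  /* block size; real libraries tune this per cache level */

/* Blocked C += A*B (all n x n, column-major): the outer loops walk
 * NB x NB tiles so each tile stays in cache while it is reused. */
static void gemm_blocked(size_t n, const double *A, const double *B, double *C)
{
    for (size_t jj = 0; jj < n; jj += NB)
        for (size_t kk = 0; kk < n; kk += NB)
            for (size_t ii = 0; ii < n; ii += NB)
                for (size_t j = jj; j < jj + NB && j < n; ++j)
                    for (size_t k = kk; k < kk + NB && k < n; ++k) {
                        double b = B[k + j * n];
                        for (size_t i = ii; i < ii + NB && i < n; ++i)
                            C[i + j * n] += A[i + k * n] * b;
                    }
}

/* Fused Level 1 gemv: y += A*x. Handling two columns per pass fuses
 * two axpy operations, so each y[i] is loaded and stored once per
 * column pair instead of once per column. */
static void gemv_fused2(size_t n, const double *A, const double *x, double *y)
{
    size_t j = 0;
    for (; j + 1 < n; j += 2) {
        double x0 = x[j], x1 = x[j + 1];
        const double *a0 = A + j * n, *a1 = A + (j + 1) * n;
        for (size_t i = 0; i < n; ++i)
            y[i] += a0[i] * x0 + a1[i] * x1;   /* two fused axpys */
    }
    if (j < n)                                  /* odd leftover column */
        for (size_t i = 0; i < n; ++i)
            y[i] += A[i + j * n] * x[j];
}
```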

- 2,934
- 2
- 17
- 19
- Is it possible to say that gemv() in most cases has better performance than gemm()? – constructor Dec 30 '15 at 10:14
- Yes, for an actual matrix-vector product gemv has better performance (assuming you don't compare a bad gemv implementation with a good gemm implementation). Having said that, with a gemv operation you can never achieve peak performance. So the trick in numerical linear algebra is finding algorithmic variants (so-called block algorithms) that utilise matrix-matrix products. – Michael Lehn Jan 01 '16 at 17:09
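To make that last point concrete, here is a sketch of the kind of rewrite block algorithms perform (the helper names are hypothetical): instead of applying A to k vectors one gemv at a time, which streams A from memory k times, pack the vectors into a matrix and issue a single gemm, which can block A and reuse each tile from cache:

```c
#include <stddef.h>
#include <cblas.h>

/* Both compute Y = A * X, where X packs k column vectors
 * (everything column-major, A is n x n). */

/* k separate gemv calls: A is read from memory k times. */
void apply_gemv(int n, int k, const double *A, const double *X, double *Y)
{
    for (int j = 0; j < k; ++j)
        cblas_dgemv(CblasColMajor, CblasNoTrans, n, n,
                    1.0, A, n, X + (size_t)j * n, 1,
                    0.0, Y + (size_t)j * n, 1);
}

/* One gemm call: each tile of A can be reused across all k
 * right-hand sides, which is how gemm approaches peak flops. */
void apply_gemm(int n, int k, const double *A, const double *X, double *Y)
{
    cblas_dgemm(CblasColMajor, CblasNoTrans, CblasNoTrans,
                n, k, n, 1.0, A, n, X, n, 0.0, Y, n);
}
```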
I think it just fits the BLAS hierarchy better, with its Level 1 (vector-vector), Level 2 (matrix-vector) and Level 3 (matrix-matrix) routines. And it may be optimizable a bit better if you know the operand is only a vector.
