I have implemented the standard matrix multiplication algorithm and Strassen's algorithm, both in Rust. The main advantage of Strassen's algorithm is that each recursive split saves one of the eight block multiplications, replacing it with additional additions and subtractions.

I set the base case of the recursion to 1, meaning that I keep splitting matrices until I reach 1x1 matrices. Based on my approximate count of the mathematical operations in both algorithms, Strassen's algorithm should be faster because it reduces the number of multiplications required. However, when I tested it, Strassen's algorithm was slower.

I measured the time required for addition and multiplication several times, and the two operations took almost the same amount of time. So my question is: how can Strassen's algorithm be faster if multiplication is not significantly slower than addition?
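A minimal sketch of the kind of measurement I did (illustrative code, not my exact benchmark; it times a chain of dependent f64 additions against an equally long chain of multiplications):

```rust
use std::time::Instant;

fn main() {
    // A million values close to 1.0, so a running product stays finite.
    let xs: Vec<f64> = (0..1_000_000).map(|i| 1.0 + i as f64 * 1e-9).collect();

    // Chain of dependent additions.
    let t = Instant::now();
    let sum = xs.iter().fold(0.0, |acc, &x| acc + x);
    let t_add = t.elapsed();

    // Chain of dependent multiplications of the same length.
    let t = Instant::now();
    let prod = xs.iter().fold(1.0, |acc, &x| acc * x);
    let t_mul = t.elapsed();

    // Printing the results keeps the compiler from optimising the loops away.
    println!("add: {t_add:?} (sum = {sum})");
    println!("mul: {t_mul:?} (product = {prod})");
}
```

Both chains took almost the same time in my runs.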

  • On modern processors, multiplication isn't much, if any, slower than addition. Cache locality is usually more of an optimisation goal with matrix multiplication. – Simon Goater Mar 29 '23 at 10:49
  • This is answered on the Wikipedia page for Strassen's algorithm: https://en.wikipedia.org/wiki/Strassen_algorithm#Asymptotic_complexity (the paragraph starting "Strassen's algorithm needs to be compared to the naive way..."). – Paul Hankin Mar 29 '23 at 11:02
  • To my knowledge, none of the most popular linear algebra libraries (e.g. Eigen) use Strassen's algorithm ... – Damien Mar 29 '23 at 12:00

1 Answer


Scalar and matrix operations are different

Imagine that you multiply two 1024x1024 matrices. Strassen's recursive block matrix approach splits each of them into four 512x512 blocks. It then makes a significant difference whether you have to perform eight multiplications of 512x512 matrices or just seven.
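In Rust (the asker's language), one such split step might look like the sketch below. The `Mat` alias and the helper names are illustrative, and the matrices are assumed to be square with power-of-two size:

```rust
type Mat = Vec<Vec<f64>>;

fn add(a: &Mat, b: &Mat) -> Mat {
    a.iter().zip(b)
        .map(|(ra, rb)| ra.iter().zip(rb).map(|(x, y)| x + y).collect())
        .collect()
}

fn sub(a: &Mat, b: &Mat) -> Mat {
    a.iter().zip(b)
        .map(|(ra, rb)| ra.iter().zip(rb).map(|(x, y)| x - y).collect())
        .collect()
}

// Copy the h x h block whose top-left corner is at (r, c).
fn block(m: &Mat, r: usize, c: usize, h: usize) -> Mat {
    (r..r + h).map(|i| m[i][c..c + h].to_vec()).collect()
}

fn strassen(a: &Mat, b: &Mat) -> Mat {
    let n = a.len();
    if n == 1 {
        // Base case as in the question; real code stops much earlier (see below).
        return vec![vec![a[0][0] * b[0][0]]];
    }
    let h = n / 2;
    let (a11, a12) = (block(a, 0, 0, h), block(a, 0, h, h));
    let (a21, a22) = (block(a, h, 0, h), block(a, h, h, h));
    let (b11, b12) = (block(b, 0, 0, h), block(b, 0, h, h));
    let (b21, b22) = (block(b, h, 0, h), block(b, h, h, h));

    // Seven recursive block products instead of eight.
    let m1 = strassen(&add(&a11, &a22), &add(&b11, &b22));
    let m2 = strassen(&add(&a21, &a22), &b11);
    let m3 = strassen(&a11, &sub(&b12, &b22));
    let m4 = strassen(&a22, &sub(&b21, &b11));
    let m5 = strassen(&add(&a11, &a12), &b22);
    let m6 = strassen(&sub(&a21, &a11), &add(&b11, &b12));
    let m7 = strassen(&sub(&a12, &a22), &add(&b21, &b22));

    // Recombine the seven products into the four result blocks.
    let c11 = add(&sub(&add(&m1, &m4), &m5), &m7);
    let c12 = add(&m3, &m5);
    let c21 = add(&m2, &m4);
    let c22 = add(&add(&sub(&m1, &m2), &m3), &m6);

    let mut c = vec![vec![0.0; n]; n];
    for i in 0..h {
        for j in 0..h {
            c[i][j] = c11[i][j];
            c[i][j + h] = c12[i][j];
            c[i + h][j] = c21[i][j];
            c[i + h][j + h] = c22[i][j];
        }
    }
    c
}
```

Note how many temporary matrices a single step allocates; this is where the memory overhead mentioned below comes from.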

For scalar numbers, elementary operations like add and multiply run at comparable speed. But this is not the case when the operands are matrices.

Addition of two n x n matrices takes n² operations. Multiplication with the schoolbook method takes n³ multiplications and n³−n² additions.
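A single Strassen step on two n x n matrices performs 7 multiplications of (n/2) x (n/2) blocks instead of 8, plus 18 additions/subtractions of such blocks. The resulting recurrence T(n) = 7·T(n/2) + 18·(n/2)² solves to O(n^log₂ 7) ≈ O(n^2.81), versus O(n³) for the schoolbook method, so the one saved block multiplication compounds at every level of recursion.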

Limiting the recursion - Crossover Number

In practice, the recursion is terminated for n < 32 or thereabouts; the exact crossover number depends on the implementation and the hardware. Smaller matrices are multiplied in the traditional way; only larger matrices are split into four blocks to apply the Strassen method.
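A minimal sketch of that cutoff, reusing the `Mat` type and the `strassen` function from the sketch above (the crossover constant is just an example value):

```rust
const CROSSOVER: usize = 32;

fn mul_naive(a: &Mat, b: &Mat) -> Mat {
    let n = a.len();
    let mut c = vec![vec![0.0; n]; n];
    for i in 0..n {
        for k in 0..n {
            for j in 0..n {
                // i-k-j loop order walks b row by row, which is cache-friendlier.
                c[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    c
}

fn multiply(a: &Mat, b: &Mat) -> Mat {
    if a.len() <= CROSSOVER {
        mul_naive(a, b) // below the crossover: schoolbook is faster in practice
    } else {
        strassen(a, b) // above it: split into blocks and recurse
    }
}
```

For the cutoff to take effect inside the recursion, the recursive calls in `strassen` would of course have to go through `multiply` instead of calling `strassen` directly.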

Practical experiences

An overview with code samples in Python, Java and C++ can be found here.

In my own experiments in C, a Strassen multiplication of two 2048 x 2048 matrices of 64-bit doubles ran in 4.2 s, while the classical method took 114 s. The Strassen method does involve a lot of memory allocation: for this example, I had to allocate 369 MBytes.

Axel Kemper