## Summary
As I see it, the problem you noticed is a consequence of the fundamental limitations of floating-point arithmetic; in particular:
- the order of floating-point calculations (in the given case: summation) matters, and
- different algorithms that use different orders may thus produce slightly different results for the exact same inputs, neither of which should be considered "more correct" than the other.
## Why the order of summation matters
Consider the example in the code below, in which we want to sum all elements of an array `a`. We will do this in two different orders: `sum_1` sums all values from left to right, while `sum_2` progressively sums neighboring pairs of values until only one value remains. Mathematically, this should not make a difference, as summation is an associative operation on the real numbers. It is not associative for the floating-point numbers on a computer, however, as we can see from the output of the last line:
```python
import numpy as np

eps_half = np.finfo(float).eps / 2
a = np.asarray([0.5, 0.5, eps_half, eps_half], dtype=float)

sum_1 = ((a[0] + a[1]) + a[2]) + a[3]  # left to right
sum_2 = (a[0] + a[1]) + (a[2] + a[3])  # pairwise

print(sum_1, "versus", sum_2)
# >>> 1.0 versus 1.0000000000000002
```
What is going on here? Notice the use of `eps`, which is defined as the difference between 1.0 and the smallest representable float greater than 1.0. For the `float` type, at least on my system, its value is ca. `2e-16`, i.e. 0.0000000000000002. The definition implies that, if we add a value of `eps / 2` or smaller to 1.0, the result is rounded back to 1.0!
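To make these claims tangible, here is a minimal check (the exact printed value of `eps` may look slightly different on your system):

```python
import numpy as np

eps = np.finfo(float).eps
print(eps)                   # ca. 2.220446049250313e-16 for 64-bit floats

print(1.0 + eps / 2 == 1.0)  # True: eps / 2 is rounded away
print(1.0 + eps == 1.0)      # False: eps is exactly the gap to the next float
print(1.0 + eps)             # 1.0000000000000002
```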
In the constructed example from above, this rounding-away happens in the calculation of `sum_1` but not of `sum_2`, as we can see from their calculation steps:
- steps for `sum_1`:
  1. `0.5 + 0.5 → 1.0` (add `a[0]` and `a[1]`: works as expected)
  2. `1.0 + eps/2 → 1.0` (add `a[2]` to the previous sum: `eps/2` is too small for the result to be representable as its own float value larger than 1.0, so the sum remains 1.0)
  3. `1.0 + eps/2 → 1.0` (add `a[3]` to the previous sum: same problem as in step 2)
- steps for `sum_2`:
  1. `0.5 + 0.5 → 1.0` (add `a[0]` and `a[1]`: works as expected)
  2. `eps/2 + eps/2 → eps` (add `a[2]` and `a[3]`: works as expected)
  3. `1.0 + eps → 1.0000000000000002` (add the results from steps 1 and 2: works as expected, because `eps` is large enough for the result to be representable as its own float value larger than 1.0)
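The following small sketch replays these steps directly, so you can verify each intermediate result yourself:

```python
import numpy as np

eps = np.finfo(float).eps

# steps for sum_1 (left to right)
step_1 = 0.5 + 0.5          # 1.0
step_2 = step_1 + eps / 2   # 1.0 -- eps / 2 is rounded away
step_3 = step_2 + eps / 2   # 1.0 -- same problem again
print(step_1, step_2, step_3)

# steps for sum_2 (pairwise)
pair_1 = 0.5 + 0.5          # 1.0
pair_2 = eps / 2 + eps / 2  # eps -- exactly representable
total = pair_1 + pair_2     # 1.0000000000000002
print(pair_1, pair_2, total)
```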
## What this has to do with matrix-vector products
You might ask yourself what your observation has to do with my example from above. Here is what is most likely going on:
Each element in the result vector of a matrix-vector product, as calculated in your code, is essentially the dot product of two vectors: (1) the corresponding row of the given matrix and (2) the given vector. The dot product, by definition, multiplies the corresponding elements of the two vectors, then sums these intermediate results. The only place where a matrix-vector product calculated from identical input values can come out differently therefore lies in this summation; or rather, in its order, as the final step of each dot product (see the sketch after the following list). And here is the thing:
- We see rounding effects like the one in my example above in floating-point summation all the time; this is not a particular problem of summing 1.0 and `eps`, but a general problem of summing large and small floating-point values.
- In array summation, no summation order is "more correct" or "less correct" than any other; in fact, how close the summation result is to the "true" result (i.e. the result calculated with infinite precision) depends on the order of the values in the array. (Some summation orders generally produce more robust results than others, but discussing this would go too far at this point.)
- It is up to the implementation of our matrix-vector multiplication algorithm which summation order to choose, and it might choose differently, e.g. for speed reasons or for different memory layouts of the given data. In your case, you definitely have different memory layouts for your matrix data (C vs. Fortran order), so that is most likely what happens: for the C-order array, a different summation order is chosen than for the Fortran-order one, so different rounding effects occur, leading to (slightly) different results, as the sketch below illustrates.
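Here is a minimal sketch of this effect. It assumes that your installed BLAS takes different code paths for the two memory layouts, which is common but not guaranteed, so the difference may well be exactly zero on some systems:

```python
import numpy as np

rng = np.random.default_rng(0)
A_c = rng.random((500, 500))     # C-contiguous (row-major) matrix
A_f = np.asfortranarray(A_c)     # same values, Fortran-order (column-major) layout
x = rng.random(500)

y_c = A_c @ x
y_f = A_f @ x

# The two layouts may be processed with different summation orders,
# so tiny discrepancies of a few eps can appear:
print(np.array_equal(y_c, y_f))  # may be False, depending on the BLAS
print(np.abs(y_c - y_f).max())   # tiny, but possibly nonzero
```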
## What is the impact of all this
Let me try to summarize the key takeaways: the order of floating-point summation matters; no summation order is "more correct" than another; and different algorithms or memory layouts may legitimately choose different orders, leading to slightly different results for identical inputs.
That said, I admit that I somewhat doubt your final conclusion, namely that the huge errors you see in `sklearn.linear_model.LinearRegression` have the same cause; maybe they are the result of the aforementioned amplification effects. You should perhaps provide a minimal reproducible example for further investigation there.