
I have two matrices, A and B, and I am attempting to multiply them in the following way:

t(A%*%B)%*%A

Typically, the order of matrix multiplication matters, but in this case B is a diagonal matrix whose diagonal elements are all equal. Therefore, I can write B = c*I, where I is the identity matrix with the same dimensions as B. Then I should be able to obtain my matrix product by computing

t(A)%*%A%*%B

since

t(A%*%B)%*%A = t(A%*%(c*I))%*%A = c*(t(A%*%I)%*%A) = c*(t(A)%*%A) = t(A)%*%A%*%(c*I) = t(A)%*%A%*%B
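
As a sanity check on the algebra, here is a minimal sketch on made-up inputs (everything below is a hypothetical stand-in, since the real data and coefficients are defined elsewhere; c0 plays the role of 1 + gamma + (gamma^2)*gamma_R):

set.seed(1)
A  <- matrix(rnorm(15*4), nrow = 15, ncol = 4)   # same shape as my A
c0 <- 2.7                                        # hypothetical scalar value
B  <- diag(c0, nrow = 4, ncol = 4)               # B = c0 * I

all.equal(t(A %*% B) %*% A, t(A) %*% A %*% B)    # TRUE: equal up to tolerance
identical(t(A %*% B) %*% A, t(A) %*% A %*% B)    # typically FALSE: not bitwise identical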

However, when I compute both expressions with my actual matrices, the two results are not identical:

A <- data[, 1:4]
B <- diag(c(1 + gamma + (gamma^2)*gamma_R), nrow = 4, ncol = 4)

t(A%*%B)%*%A - t(A)%*%A%*%B

             [,1]         [,2] [,3] [,4]
[1,] 1.421085e-14 1.136868e-13    0    0
[2,] 1.136868e-13 0.000000e+00    0    0
[3,] 0.000000e+00 0.000000e+00    0    0
[4,] 0.000000e+00 0.000000e+00    0    0

gamma and gamma_R are just coefficients obtained from a previous regression that I ran. Machine precision is 2.220446e-16, and these errors are several orders of magnitude above that. Is the discrepancy due to the quantities that I multiply first being rounded at machine precision, with those errors then magnified through the summing that occurs when multiplying by the final matrix? Or is there another likely reason? A is only 15x4, and B is 4x4.
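
For what it's worth, the absolute discrepancy appears to scale with the magnitude of the entries involved. A minimal sketch on made-up inputs (the scalar in B is hypothetical):

set.seed(2)
A   <- matrix(rnorm(15*4, sd = 10), nrow = 15)   # deliberately larger entries
B   <- diag(3.14, nrow = 4, ncol = 4)            # hypothetical scalar matrix
rhs <- t(A) %*% A %*% B
d   <- t(A %*% B) %*% A - rhs

max(abs(d))                  # absolute error: well above .Machine$double.eps
max(abs(d)) / max(abs(rhs))  # relative error: on the order of .Machine$double.eps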

If numerical precision is the reason for the discrepancy, is there a preferred order for performing the operations? Should matrix products always be simplified so that they require the fewest calculations?
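
For reference, R's crossprod() computes t(A) %*% A in a single step, and scaling by the scalar directly avoids one matrix product altogether; a sketch (c0 hypothetical as before):

set.seed(3)
A  <- matrix(rnorm(15*4), nrow = 15)
c0 <- 1.5                                  # hypothetical scalar

res1 <- c0 * crossprod(A)                  # crossprod(A) is t(A) %*% A in one BLAS call
res2 <- t(A %*% (c0 * diag(4))) %*% A      # the original formulation
all.equal(res1, res2)                      # TRUE up to numeric tolerance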

  • Just to be clear, your question is *"Is it possible that the discrepancies are still caused by the way in which the quantities that I multiply first are truncated due to machine precision, and these get magnified through the summing that occurs when multiplying by the final matrix?"*, i.e., you want to know whether it's possible that this is the particular reason for the minute differences in your results? Sure, it's possible. – Gregor Thomas Sep 01 '20 at 18:56
  • @GregorThomas I want to know whether there is any other possible reason, and whether others have had similar experiences. A comment in which someone says "regardless of the calculation, you'll never see errors like this that are larger than 1e-7 or 1e-8" and a reason why would also be informative. – EB727 Sep 01 '20 at 19:03
  • I'd suggest editing your question to make that (what you're looking for in an answer) clearer. But I'm skeptical that you'll get an answer that is useful. Numerical imprecision can add up, depending on the operations you're doing, if there are iterations, etc. I frequently see `sqrt(.Machine$double.eps)` used as a tolerance, which is about `1.5e-8`, but it's easy to come up with examples that exceed that tolerance. No one can make guarantees because your next step might be to multiply by `1e13`, and all of a sudden your discrepancies are on the order of `0.1` and `1`. – Gregor Thomas Sep 01 '20 at 19:09
  • @GregorThomas Yes, that makes sense. I modified the question a bit. Thanks for your tips! – EB727 Sep 01 '20 at 19:18
  • I don't know about number of operations, but in general you want to avoid adding/subtracting numbers of very different magnitudes, e.g. sums are more precise if the summands are ordered smallest-to-largest. I wouldn't know how this generalizes to matrix operations without way too much effort on my part ... – Ben Bolker Sep 01 '20 at 19:24
  • @BenBolker Interesting. Why is that? – EB727 Sep 01 '20 at 20:27
  • Because the mantissa of a floating-point number has limited precision. Add two numbers that differ by a factor of more than `1/.Machine$double.eps` and you'll get a catastrophic loss of precision (because this is equivalent to computing `1+x` when `x < .Machine$double.eps`). See e.g. https://scicomp.stackexchange.com/questions/10869/which-algorithm-is-more-accurate-for-computing-the-sum-of-a-sorted-array-of-numb ... (a minimal sketch follows after these comments). Could you provide a [mcve]? – Ben Bolker Sep 01 '20 at 20:46
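
A minimal sketch of the effect described in the last two comments. Reduce() forces plain left-to-right double-precision accumulation (sum() may use an extended-precision accumulator on some platforms, which would hide the effect):

x <- c(1e16, rep(1, 1000))
Reduce(`+`, x)      - 1e16           # 0: each +1 falls below the spacing between doubles near 1e16
Reduce(`+`, rev(x)) - 1e16           # 1000: summing smallest-to-largest preserves the ones
(1 + .Machine$double.eps / 2) == 1   # TRUE: the increment is lost entirely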
