If we want to find the optimal parameters theta for a linear regression model, we can use the normal equation:

theta = inv(X^T * X) * X^T * y

One step is to compute inv(X^T * X), and NumPy provides two functions for this: np.linalg.inv() and np.linalg.pinv(). However, they lead to different results:
import numpy as np

X = np.matrix([[1, 2104, 5, 1, 45],
               [1, 1416, 3, 2, 40],
               [1, 1534, 3, 2, 30],
               [1,  852, 2, 1, 36]])
y = np.matrix([[460], [232], [315], [178]])

XT = X.T
XTX = XT @ X

# Solve with the Moore-Penrose pseudoinverse
pinv = np.linalg.pinv(XTX)
theta_pinv = (pinv @ XT) @ y
print(theta_pinv)
[[188.40031946]
[ 0.3866255 ]
[-56.13824955]
[-92.9672536 ]
[ -3.73781915]]
# Solve with the ordinary inverse
inv = np.linalg.inv(XTX)
theta_inv = (inv @ XT) @ y
print(theta_inv)
[[-648.7890625 ]
[ 0.79418945]
[-110.09375 ]
[ -74.0703125 ]
[ -3.69091797]]
The first output, i.e. the one produced by pinv(), is the correct one, and using pinv() is also recommended in the numpy.linalg.pinv() docs. But why is this, and what are the differences / pros / cons between inv() and pinv()?
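For context, here is a quick sanity check on the same data (a sketch, not part of the original computation): with only 4 samples and 5 parameters, X^T X is a 5x5 matrix of rank at most 4, which may be relevant to the discrepancy:

```python
import numpy as np

# Same design matrix as above: 4 samples, 5 parameters (incl. intercept)
X = np.array([[1, 2104, 5, 1, 45],
              [1, 1416, 3, 2, 40],
              [1, 1534, 3, 2, 30],
              [1,  852, 2, 1, 36]], dtype=float)
XTX = X.T @ X

# Rank of the 5x5 Gram matrix is at most min(4, 5) = 4, i.e. it is singular
print(np.linalg.matrix_rank(XTX))  # less than 5
print(np.linalg.cond(XTX))         # extremely large condition number
```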