If we want to find the optimal parameters theta for a linear regression model, we can use the normal equation:

theta = inv(X^T * X) * X^T * y

One step is to compute inv(X^T * X), and NumPy provides two functions for this: np.linalg.inv() and np.linalg.pinv(). However, they lead to different results:
import numpy as np

X = np.matrix([[1, 2104, 5, 1, 45],
               [1, 1416, 3, 2, 40],
               [1, 1534, 3, 2, 30],
               [1,  852, 2, 1, 36]])
y = np.matrix([[460], [232], [315], [178]])

XT = X.T
XTX = XT @ X

# Solve with the Moore-Penrose pseudoinverse
pinv = np.linalg.pinv(XTX)
theta_pinv = (pinv @ XT) @ y
print(theta_pinv)
[[188.40031946]
[ 0.3866255 ]
[-56.13824955]
[-92.9672536 ]
[ -3.73781915]]
# Solve with the ordinary inverse
inv = np.linalg.inv(XTX)
theta_inv = (inv @ XT) @ y
print(theta_inv)
[[-648.7890625 ]
[ 0.79418945]
[-110.09375 ]
[ -74.0703125 ]
[ -3.69091797]]
The first output, i.e. the one produced by pinv(), is the correct one, and using pinv() is also recommended in the numpy.linalg.pinv() docs. But why is this, and what are the differences / pros / cons between inv() and pinv()?
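For context, here is a quick sanity check on the same data (a sketch, not part of the original computation): with only 4 samples and 5 parameters, X^T X is a 5x5 matrix of rank at most 4, which may be relevant to the discrepancy:

```python
import numpy as np

# Same design matrix as above: 4 samples, 5 parameters (incl. intercept)
X = np.array([[1, 2104, 5, 1, 45],
              [1, 1416, 3, 2, 40],
              [1, 1534, 3, 2, 30],
              [1,  852, 2, 1, 36]], dtype=float)
XTX = X.T @ X

# Rank of the 5x5 Gram matrix is at most min(4, 5) = 4, i.e. it is singular
print(np.linalg.matrix_rank(XTX))  # less than 5
print(np.linalg.cond(XTX))         # extremely large condition number
```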