-2

I have this matrix

array([[  4.58437363e-04,   6.87656045e-03,   6.87656045e-02,
      1.10652674e-01,   1.65979011e+00,   1.65979011e+01,
      3.66749891e-03,   5.50124836e-02,   5.50124836e-01,
      8.85221390e-01,   1.32783208e+01,   1.32783208e+02],
   [  6.99837609e-02,   1.39967522e+00,   1.43466710e+01,
      1.46232455e-01,   2.92464911e+00,   2.99776533e+01,
      2.53691133e-01,   5.07382267e+00,   5.20066823e+01,
      5.30092650e-01,   1.06018530e+01,   1.08668993e+02],
   [  2.84535496e-03,   3.41442595e-02,   4.83710343e-01,
      1.84948072e-01,   2.21937687e+00,   3.14411723e+01,
      1.23061602e-02,   1.47673922e-01,   2.09204723e+00,
      7.99900413e-01,   9.59880495e+00,   1.35983070e+02],
   [  3.31593931e-02,   2.32115752e-01,   6.46608166e+00,
      2.42778797e-01,   1.69945158e+00,   4.73418654e+01,
      8.70102475e-02,   6.09071733e-01,   1.69669983e+01,
      6.37051563e-01,   4.45936094e+00,   1.24225055e+02]])

When I calculate the inverse in Excel using the array formula

=MINVERSE(B27:M38)

I get

-1.68568E+17    -1.32003E+15    -8.52171E+14    4.94857E+16 -7.86173E+15    4.96639E+14 -1.13617E+17    2.33346E+15 6.79271E+14 -4.96618E+16    2.85561E+15 4.09202E+13
-3.06571E+16    -5.59088E+14    2.25152E+14 1.31473E+15 3.07834E+14 -5.62038E+13    4.52019E+15 1.70789E+14 -2.28526E+13    4.20304E+15 -2.45416E+14    -1.47953E+12
1.81455E+14 5.59212E+13 -6.6655E+12 2.67828E+14 -3.69694E+13    -1.67831E+12    1.97479E+14 2.98161E+13 -2.22862E+12    5.85569E+13 -7.57931E+12    8.07625E+11
7.85428E+13 -5.42549E+14    2.63919E+13 -2.54175E+16    7.29192E+14 9.38632E+13 1.42654E+16 -3.63692E+13    -6.24915E+13    5.70179E+15 -1.68685E+14    -2.09159E+13
-6.01158E+14    5.7767E+13  -3.66461E+13    1.08182E+15 -1.46389E+14    4.3594E+12  -6.35683E+14    1.32624E+13 9.66142E+12 -9.67009E+14    4.0771E+13  2.74315E+12
-1.03323E+14    -1.55034E+13    2.05109E+12 5.20197E+13 6.17586E+12 -7.19368E+11    6.94553E+13 6.31804E+11 -2.80104E+11    4.07483E+13 -1.84912E+12    -1.18293E+11
5.18468E+16 4.88032E+14 -2.99218E+14    -8.71756E+15    1.28383E+15 1.89111E+14 6.82672E+15 -3.5848E+15 1.508E+14   -2.18052E+16    1.26861E+15 -1.32931E+13
7.49045E+15 2.11314E+14 -2.67962E+13    -3.9188E+14 -5.12058E+13    7.56573E+12 -1.57588E+15    3.06765E+13 -5.03822E+12    -2.60028E+14    1.91038E+13 -1.27174E+11
1.51294E+14 -1.5578E+13 1.14145E+12 -1.22333E+14    1.23247E+13 -3.1036E+11 6.15507E+13 -1.9115E+12 -1.0148E+11 7.22411E+13 -3.48559E+12    -1.47163E+11
2.16878E+16 6.9913E+14  -7.80275E+13    -1.01657E+15    -2.05019E+14    1.06245E+13 -5.32103E+15    6.6262E+13  -1.03726E+13    -1.95913E+15    4.15017E+13 1.10843E+13
-9.77565E+14    -7.57216E+13    4.71499E+12 -1.08066E+14    2.16929E+13 -4.34593E+11    5.51293E+14 -4.82981E+11    -4.12064E+11    1.55974E+14 -4.4425E+12 -7.30823E+11
-2.95506E+13    4.54732E+12 2.37753E+11 1.9272E+13  -3.87232E+11    -75168166104    -3.20143E+13    -5.46016E+11    75013747465 40232740147 7907929619  -2339538070

However, when I calculate the inverse using python using numpy

ata_inv=numpy.linalg.inv(ata)

I get this

array([[ -2.92174491e+17,   8.78092304e+13,  -3.74278555e+14,
      6.85675722e+16,  -7.88519831e+15,   3.97491106e+14,
     -1.19443836e+17,   1.96271342e+15,   7.75339916e+14,
     -4.31451830e+16,   2.66377305e+15,   1.34678664e+13],
   [ -1.20148124e+16,  -1.39577173e+14,   1.62609200e+14,
     -2.05343938e+15,   2.47311333e+14,  -3.70464968e+13,
      2.13473015e+15,   2.02356660e+14,  -3.77977161e+13,
      3.19465019e+15,  -2.20164429e+14,   3.94769138e+12],
   [  1.03452264e+14,   4.49963139e+13,  -5.83232093e+12,
      3.62478360e+14,  -3.57104899e+13,  -2.31688528e+12,
      2.19120097e+14,   3.04727893e+13,  -2.15230839e+12,
      6.27499323e+13,  -8.08233158e+12,   8.14399730e+11],
   [  8.99729769e+15,  -7.33200660e+14,  -2.07714396e+13,
     -2.58353689e+16,   6.90934358e+14,   9.91927528e+13,
      1.47797299e+16,  -3.93583461e+12,  -6.50452782e+13,
      4.86796183e+15,  -1.39660309e+14,  -1.81127868e+13],
   [ -3.48430812e+15,   1.12742047e+12,  -2.70225162e+13,
      1.67561520e+15,  -1.40107200e+14,   1.14260887e+12,
     -3.44439537e+14,   8.36705528e+12,   1.21738153e+13,
     -8.28685468e+14,   3.72852679e+13,   1.98978705e+12],
   [  6.58910302e+13,  -1.22902337e+13,   1.39683509e+12,
      2.27296720e+13,   5.47359894e+12,  -5.34359515e+11,
      5.07774800e+13,   9.33372125e+11,  -3.97227009e+11,
      3.00641665e+13,  -1.53501520e+12,  -6.79505533e+10],
   [  4.59914066e+15,  -9.70909800e+14,  -2.45271675e+14,
     -3.96303832e+15,   1.32677231e+15,   1.79898008e+14,
      1.71089473e+16,  -3.68415009e+15,   2.01159781e+14,
     -2.02750198e+16,   1.29185320e+15,  -2.94060796e+13],
   [  5.29168020e+15,   1.89585534e+14,  -1.64697929e+13,
     -9.16154758e+12,  -4.07498605e+13,   4.84728269e+12,
     -1.45007322e+15,   2.63347736e+13,  -3.81669434e+12,
     -9.78100475e+13,   1.34239556e+13,  -7.53836147e+11],
   [  2.82268043e+14,  -1.59196485e+13,   7.63324745e+11,
     -1.48459210e+14,   1.26222499e+13,  -2.02131815e+11,
      6.36919902e+13,  -1.56430001e+12,  -2.38728387e+11,
      6.81867356e+13,  -3.41941515e+12,  -1.22099795e+11],
   [  1.52015592e+16,   6.32482966e+14,  -4.52320855e+13,
     -5.91644074e+12,  -1.65167701e+14,   2.55304641e+12,
     -4.89017768e+15,   5.33549362e+13,  -7.52567672e+12,
     -1.41538267e+15,   2.19297078e+13,   9.07563720e+12],
   [ -4.64671981e+14,  -7.43091440e+13,   2.03611340e+12,
     -1.63922466e+14,   1.83754071e+13,   9.08311874e+10,
      5.27684329e+14,   7.77828696e+11,  -5.83508093e+11,
      1.08824465e+14,  -2.76885598e+12,  -5.61136635e+11],
   [ -2.89682527e+13,   5.34539735e+12,   2.84749412e+11,
      1.45849833e+13,  -2.84208676e+11,  -6.06452748e+10,
     -3.44726894e+13,  -5.87172107e+11,   5.76165423e+10,
      1.36639289e+12,  -4.78031299e+10,  -4.71743402e+09]])

Why are these answers different? And which one is correct, if either?

In the end, I just want to find the least squares estimate of a polynomial. I thought I'd do it using inverses, but I am now thinking this is not the best approach.

Matt Cremeens
  • 4,951
  • 7
  • 38
  • 67
  • 2
    What if you test a smaller 2x2 or 3x3 matrix first? There you can also do the inversion by hand without much problem. – Ignacio Vergara Kausel Nov 03 '17 at 11:24
  • 1
    Why don't you multiply the inverses and the originals, then see which result is closest to the identity matrix? That should immediately tell you which answer is most correct. – Tom Karzes Nov 03 '17 at 11:26
  • 2
    The correct one is the one that gives you `dot(A,Ainv)==I` – gboffi Nov 03 '17 at 11:26
  • Anyhow, I'd worry more about the humongous values that you get in both cases. Hope that you know what you're doing with your original matrix with very "wild" elements ranging from `0.006005578` to `63363.59016`. It might be that such differences can make excel or numpy lose precision/stability. – Ignacio Vergara Kausel Nov 03 '17 at 11:33
  • I'm sorry. The matrix is square. I pasted the wrong one in initially. Seem my edit. – Matt Cremeens Nov 03 '17 at 11:47
  • 1
    Possibly no one is correct, for the reason outlined by Ignacio and Paul. I have inverted your 12x12 matrix with numpy linked to a different numerical library and I've got still different results from excel and yours copy of numpy... – gboffi Nov 03 '17 at 11:52
  • @gboffi are you telling me that with matrices of this kind I may never be guaranteed an accurate result using a computer? – Matt Cremeens Nov 03 '17 at 12:30
  • There is a difference between using 64-bit-FP-based linear-algebra tools (speed!) and all kinds of extended precision methods or even symbolic-calculation. Your question then is hard to answer without a very specific case of data and algorithm in mind. ```1/3``` won't give you an accurate result either (for the former 2 approaches)! – sascha Nov 03 '17 at 12:33
  • I keep getting downvoted. Is there anything I can do to improve this question? Does it simply not belong here? – Matt Cremeens Nov 03 '17 at 12:35
  • @gboffi by the way, doing `dot(A,Ainv)` on both inverses (excel and python) neither gave me the identity matrix. :( – Matt Cremeens Nov 03 '17 at 12:41
  • That's useless until you tell us how big the residual is. And mixing stuff like numpy's inv and numpy's lstsq (two very different approaches) does not make this question better! (and i'm ignoring the fact that the matrix is still non-square in your post; maybe there are some hidden/obfuscated AAt ops, but it's not nice to read) – sascha Nov 03 '17 at 12:42
  • @sascha You're right. I keep copying the wrong matrix. It's a 12x12. – Matt Cremeens Nov 03 '17 at 12:46
  • As I said, given the numbers you most likely are running into stability issues. You can try and see if using http://mpmath.org/ to increase precision could help you, but I think it's highly unlikely. Simply put, your matrix is ill defined for computational inversion. – Ignacio Vergara Kausel Nov 03 '17 at 12:54
  • @IgnacioVergaraKausel Thank you. This is unfortunate as it has come up in a practical problem I am trying to solve. – Matt Cremeens Nov 03 '17 at 13:01
  • 1
    Can't be that important if you don't post your error-residuals (of your identity-check) or edit your matrix to some square-matrix we can try to invert too. :-) – sascha Nov 03 '17 at 13:03
  • @sascha You're right. I need to provide the info you are requesting, but the computer that this program resides on is not connected to the internet at the moment. I need a little time. In your opinion, would it be better for me to us `lstsq` in numpy to get the coefficients to the polynomial I'm after? – Matt Cremeens Nov 03 '17 at 13:13
  • 1
    @MattCremeens well, you've never said what for you need the inverse... even less about coefficients about a polynomial. That would help a lot regarding the solution that can be reached. – Ignacio Vergara Kausel Nov 03 '17 at 13:23
  • @IgnacioVergaraKausel I am sorry. I am rusty when it comes to matrix algebra. I edited my question to provide the end purpose of inverting the matrix in the first place. I am really just after a least-squares estimate that I can rely on. I was trying to do it line-by-line, but perhaps there is a better way. – Matt Cremeens Nov 03 '17 at 13:25
  • 2
    Yes... using the inverse is wrong (or at least very bad practice). Look at scipy's curve_fit or optimize module in general. – sascha Nov 03 '17 at 13:25
  • @sascha I had no idea. I was just following the practice provided in a textbook and it is reminiscent of what I learned while getting my undergrad. Thanks for this tip. I will look into it. – Matt Cremeens Nov 03 '17 at 13:33

1 Answers1

1

There are two reasons you are getting different answers

Fist, there is an issue of numerical instability. Take a look at the values in the inverses - they are very large. This means that the original matrix you are trying to invert has at least one singular value that is very small, meaning that the matrix is difficult to invert.

Secondly, you're trying to invert a non-square matrix. Matrix inverses are actually only unique in the case of square matrices. For non-square matrices, officially matrix inverses don't exists. Instead, the functions you are calling will be finding the left- or right-inverses, but these are in general not unique, and so two different but completely valid implementations of algorithms to find these inverses may give different answers.

Paul Rubenstein
  • 342
  • 1
  • 2
  • 12
  • `A=A.reshape((12,12))` – gboffi Nov 03 '17 at 11:37
  • As should be clear from `MINVERSE(B27:M38)` because `38-26==12` and `ord('M)-ord('A')==12` — please REMOVE the second part of your answer and correct the title. – gboffi Nov 03 '17 at 11:45
  • I appreciate this answer. I used numpys `lstsq` and got an answer consistent with the python inversion. Am I simply not guaranteed an accurate result? – Matt Cremeens Nov 03 '17 at 12:29