
I would like to find all the real roots of a univariate polynomial. I could use the Jenkins–Traub algorithm, for example, but I want to learn how to solve it using the companion matrix.

I know how to turn a polynomial into a companion matrix, and I found a script which does QR decomposition: http://quantstart.com/articles/QR-Decomposition-with-Python-and-NumPy

And this is where I'm lost: what to do next? I think I have to calculate multiple decompositions but when I do, I always get the same result (obviously). I also read that it might be useful to first convert the companion matrix to Hessenberg form - but how? Then there are "shifts" - what are they?

I also found http://www.nr.com/webnotes/nr3web17.pdf but as I don't understand any of it I would like to know if there is any simpler method (even if slower or less stable).

In other words: reading http://en.wikipedia.org/wiki/QR_algorithm

  • "let A be a real matrix of which we want to compute the eigenvalues"
    ok, that's my companion matrix, right?

  • "we compute the QR decomposition Ak=QkRk"
    that would be Q, R = householder(A) from the very first link, right?

  • "We then form Ak+1 = RkQk"
    easy, just multiply R and Q

  • "Under certain conditions,[2] the matrices Ak converge to a triangular matrix, the Schur form of A. The eigenvalues of a triangular matrix are listed on the diagonal, and the eigenvalue problem is solved."
    ...wait, what? I tried:

    for i in range(100):
        Q, R = householder(A)
        A = mult_matrix(R, Q)
    

but there doesn't seem to be any progress made and I cannot see any numbers even close to the correct roots.

Please, can anyone explain this to me?

PS: I don't want to blindly use LAPACK or similar as I want to understand how it works, at least in very simplified terms.

PPS: There is also http://adorio-research.org/wordpress/?p=184 (not sure how it is different from the first method, though...)

Ecir Hana

1 Answer


Your companion matrix is the coefficient transformation matrix of the linear polynomial operation that maps q(x) to x*q(x) mod p(x),

where p(x) = x^(n+1) + p_n*x^n + ... + p_1*x + p_0.

Explicitly, A has the shape

0 0 0 ... 0 -p_0
1 0 0 ... 0 -p_1
0 1 0 ... 0 -p_2
. . . ... . . . 
0 0 0 ... 1 -p_n

which already is in Hessenberg form. Since this form is preserved during the QR algorithm, you may use the QR decomposition with Givens rotations where only rotations close to the diagonal occur.
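Such a Givens-based step might be sketched as follows (a hypothetical helper, not part of the linked script, using plain Python lists as in the question; each rotation acts only on two adjacent rows or columns):

```python
import math

def givens_qr_step(H):
    """One unshifted QR step H -> R*Q on a Hessenberg matrix H, using
    Givens rotations: each rotation zeroes one subdiagonal entry."""
    n = len(H)
    A = [row[:] for row in H]
    rotations = []
    # Forward sweep: reduce A to the upper triangular factor R.
    for k in range(n - 1):
        a, b = A[k][k], A[k + 1][k]
        r = math.hypot(a, b)
        c, s = (1.0, 0.0) if r == 0 else (a / r, b / r)
        rotations.append((c, s))
        for j in range(k, n):
            u, v = A[k][j], A[k + 1][j]
            A[k][j] = c * u + s * v
            A[k + 1][j] = -s * u + c * v
    # Backward sweep: apply the transposed rotations from the right,
    # forming R*Q, which is again in Hessenberg form.
    for k, (c, s) in enumerate(rotations):
        for i in range(k + 2):
            u, v = A[i][k], A[i][k + 1]
            A[i][k] = c * u + s * v
            A[i][k + 1] = -s * u + c * v
    return A
```

Because only rotations of adjacent rows (k, k+1) next to the diagonal are needed, one step costs O(n^2) instead of the O(n^3) of a general dense QR decomposition.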

In the unshifted QR algorithm you should observe a noticeable development at least in the lower right 3x3 block.

If you take one of the eigenvalues of the lower right 2x2 block and subtract it from every diagonal element, you get the shifted QR algorithm of Francis. The shift s, the number subtracted, is the current best estimate of the smallest eigenvalue. Note that you have now quite probably left the real domain and have to compute with complex matrix entries. Keep the shift s in memory, add any new shift from subsequent steps to it, and add the combined shift back to any eigenvalue found.
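A simplified single-shift step could look as follows (a sketch, using numpy.linalg.qr for brevity in place of the hand-written routine; adding the shift back after every step is equivalent to keeping a running total as described above):

```python
import numpy as np

def shifted_qr_step(A):
    """One single-shift QR step: shift by the current best eigenvalue
    estimate, do a QR step, and add the shift back immediately."""
    n = A.shape[0]
    s = A[n - 1, n - 1]                  # simplest choice of shift
    Q, R = np.linalg.qr(A - s * np.eye(n))
    return R @ Q + s * np.eye(n)
```

If the shift becomes complex, the same code works on a complex-typed array.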

A matrix split occurs whenever any of the subdiagonal entries is practically zero. If the split occurs in the last row, then the last diagonal entry is an eigenvalue of the shifted matrix (A-s*I). If the split separates the last 2x2 block, then one can easily determine its eigenvalues which again are eigenvalues of the shifted matrix.

If the split happens anywhere else, the QR algorithm is recursively applied to the diagonal blocks separately.
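The split test itself can be as simple as the following sketch (a hypothetical helper, list-of-lists representation as above):

```python
def split_points(A, tol=1e-12):
    """Indices k at which the subdiagonal entry A[k][k-1] is numerically
    zero, so that A decouples into independent diagonal blocks."""
    return [k for k in range(1, len(A)) if abs(A[k][k - 1]) <= tol]
```

Consecutive split points delimit the diagonal blocks to treat separately; trailing 1x1 and 2x2 blocks yield eigenvalues directly as described above.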

Addendum I

Convergence of all variants of the algorithm is measured by the entries below the diagonal converging to zero.

The basic algorithm converges geometrically in those subdiagonal entries: the entry at position (i, i-1) shrinks per step by a factor of roughly the ratio of the magnitudes of the eigenvalues at positions i and i-1. A zero is reached quickly wherever there is a large jump between consecutive eigenvalue magnitudes.
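This decay can be observed numerically; a small illustration (using numpy's QR for brevity) with a companion matrix whose eigenvalues 9, 3, 1 are well separated:

```python
import numpy as np

# companion matrix of (x-1)(x-3)(x-9) = x^3 - 13x^2 + 39x - 27
A = np.array([[0.0, 0.0, 27.0],
              [1.0, 0.0, -39.0],
              [0.0, 1.0, 13.0]])
for _ in range(40):
    Q, R = np.linalg.qr(A)
    A = R @ Q
# A[1][0] shrinks like (3/9)^k and A[2][1] like (1/3)^k, so after 40
# steps the diagonal holds the eigenvalues 9, 3, 1 to high accuracy
```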

Complex conjugate eigenvalues have the same magnitude, so for such a pair the algorithm produces a 2x2 block on the diagonal; solve the quadratic characteristic equation of that block to get the corresponding eigenvalues.
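That quadratic solve is a few lines; a sketch (cmath covers the complex-conjugate case):

```python
import cmath

def eig2x2(a, b, c, d):
    """Eigenvalues of the block [[a, b], [c, d]], i.e. the roots of
    x^2 - (a + d)*x + (a*d - b*c) = 0."""
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2
```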

Larger diagonal blocks will happen for multiple or clustered eigenvalues.

Addendum II

It is the line

   alpha = -cmp(x[0],0) * norm(x)

in the householder procedure. In most situations, x[0] will not be exactly 0, so this produces a sign. However, in the case of the companion matrix, x[0] == 0 by construction, so no sign is produced and alpha gets the wrong value 0. Change it to the more pedestrian

    alpha = norm(x)
    if x[0] < 0: alpha = -alpha

and it works well.
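Following the discussion in the comments below, a version of this sign choice that also works with complex entries might look like the following sketch (the norm is inlined so the snippet is self-contained; the x[0] == 0 fallback is the essential fix):

```python
def householder_alpha(x):
    """alpha for the Householder reflection: |x| times the 'sign'
    x[0]/|x[0]|, falling back to +|x| when x[0] is exactly zero."""
    nrm = sum(abs(v) ** 2 for v in x) ** 0.5
    if x[0] == 0:
        return nrm
    return (x[0] / abs(x[0])) * nrm
```

For real x[0] this reduces to the if/else fix above; either overall sign yields a valid reflection, the choice only affects numerical behavior.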

# householder and mult_matrix are the routines from the QR decomposition
# script linked in the question
from pprint import pprint

def companion_matrix(p):
    n = len(p) - 1
    C = [[float(i + 1 == j) for i in range(n - 1)] for j in range(n)]
    for j in range(n):
        C[j].append(-p[n - j] / p[0])
    return C


def QR_roots(p):
    n = len(p) - 1
    A = companion_matrix(p)
    for k in range(10 + n):
        print("step:", k + 1, " after", (k + 1) * (5 + n), "iterations")
        for j in range(5 + n):
            Q, R = householder(A)
            A = mult_matrix(R, Q)
        print("below diagonal")
        pprint([A[i + 1][i] for i in range(n - 1)])
        print("diagonal")
        pprint([A[i][i] for i in range(n)])
        print("above diagonal")
        pprint([A[i][i + 1] for i in range(n - 1)])


p=[ 1, 2, 5, 3, 6, 8, 6, 4, 3, 2, 7]
QR_roots(p)

#for a case with multiple roots at 1 and 4
#p= [1,-11,43,-73,56,-16]
#QR_roots(p)
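As an optional sanity check (my suggestion, not part of the original answer): numpy.roots uses the same companion-matrix eigenvalue approach internally, so its output can be compared against the diagonal printed above.

```python
import numpy as np

p = [1, 2, 5, 3, 6, 8, 6, 4, 3, 2, 7]
print(np.roots(p))  # eigenvalues of the companion matrix of p
```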
Lutz Lehmann
  • Thanks a lot for the reply! Unfortunately, I'm totally lost after "you may use the QR decomposition". What are these "rotations", anyway? I guess I have to re-read the article again. What troubles me most is that the first linked QR decomposition does not develop at all. Why? However, the second linked QR decomposition does! From my limited understanding, is it sufficient to do `Q, R = decomp(A)` and `A = R x Q`? All the shifts and whatnot are there just to improve convergence rate and stability, right? What are the steps of the very minimal procedure? – Ecir Hana Dec 11 '13 at 09:54
  • Yes, that is the basic algorithm. Can you check that QxR really gives back the original matrix A? Ignore the Givens rotations for the moment; it would be more interesting if you could tell about the behavior of the lower right 3x3 block. [Removed the previous comment and added its contents to the answer above.] – Lutz Lehmann Dec 11 '13 at 11:44
  • Yes, it results in the original A. But I guess that's because Q is the identity matrix. In other words, when I do RxQ it always results in the original companion matrix (when I use the first link; with the second link it works, as I mentioned). – Ecir Hana Dec 11 '13 at 12:52
  • See Addendum II: the code assumes that x[0] never has exactly the value 0, which is wrong in the case of companion matrices. – Lutz Lehmann Dec 11 '13 at 13:56
  • Awesome, it works! Thank you very much! Tiny note: when I use `if x[0] < 0: alpha = -alpha` it raises "no ordering relation is defined for complex numbers". I changed it to `if x[0].real < 0: alpha = -alpha` and it seems to work. Do you think it is the correct fix? – Ecir Hana Dec 12 '13 at 11:15
  • It results in a sign that is either +1 or -1, so it works. But Householder reflections and QR decompositions for complex matrices are a lot more complicated. Essentially, the sign to use is x[0]/abs(x[0]), with extra provisions for x=0. But check that the resulting reflection really exchanges x and e. – Lutz Lehmann Dec 12 '13 at 12:32
  • Btw., where does that `10+2*n` come from? Empirical tests? – Ecir Hana Dec 12 '13 at 17:44
  • Just to have anything growing with the matrix size. In trying it out it was too few, so I added an inner loop to see some progress between the outputs. No deep science behind the choices. – Lutz Lehmann Dec 12 '13 at 18:04
  • Hi! May I ask you one more thing? I have a problem with the polynomial `x^2-4`: it won't converge, the entries above and below the diagonal just switch. I think it is the same problem (?) as I have here http://stackoverflow.com/questions/22261772/power-iteration . Is there anything I can do about this? I thought this companion-matrix root finding would find all the roots every time, but now it seems that it is quite sensitive... – Ecir Hana Mar 08 '14 at 10:28
  • The QR algorithm stops if the deflation by splitting in diagonal blocks ends with a 2x2 or 1x1 matrix. The 2x2 case is then solved by finding the characteristic polynomial and solving the characteristic equation. So it makes no sense to apply this method to quadratic polynomials. – Lutz Lehmann Mar 08 '14 at 10:48
  • Ok, thanks! So if I try `x^4-2x^2+1` it also does not work, but that is because the deflation splits the matrix into two 2x2 blocks again. Right? (Btw., in the "diagonal" printing there should be `A[i][i]` instead of `A[1][i]`.) – Ecir Hana Mar 08 '14 at 10:57
  • Thanks, corrected. As explained in the other thread, in the power iteration, the eigenspaces of eigenvalues of the same magnitude do not split, they all contribute to the iteration vector. A shift (by a small random number, or doing two iterations, once up by a small number and once down) of the matrix will separate the eigenspaces, but then the iteration vector will still wander around in the eigenspace of the largest eigenvalue, unless that has dimension 1. With multiplicity you need some Krylov space based method to identify the full eigenspace. – Lutz Lehmann Mar 08 '14 at 11:13