
Bairstow's root-finding method needs very good initial approximations for the quadratic factors in order to converge.

I tried various constants, random numbers, and fractions formed from the trailing coefficients (-a1/a2, -a0/a2; Lin's method?), to no avail.

Please, does anyone know of a good method for choosing the factors?

For example:

1*x^8 + 118*x^7 + 1*x^6 + 2*x^5 - 2*x^4 - 3*x^3 + 3*x^2 + 2*x + 1

It takes 3x as much time to find the roots with the initial approximations 0.1, 0.2 as it does with 0.2, 2.0.

Or:

1*x^8 - 36*x^7 + 546*x^6 - 4536*x^5 + 22449*x^4 - 67284*x^3 + 118124*x^2 - 109584*x + 40320

takes about 50% longer with 0.1, 1.2 than with 0.1, 0.1.


Trying to use Cauchy's bound for the initial quadratic approximation:

from math import pi, cos, sin
from random import random

# a[0..n]: coefficients in descending order, a[0] is the leading coefficient
R = 0
for i in range(1, n + 1):
    R = max(abs(a[i] / a[0]), R)
R = 1 + R                           # Cauchy bound: every root has |x| <= R
phi = 2 * pi * random()             # random point on the circle of radius R
x1 = complex(R * cos(phi), R * sin(phi))
x2 = complex(x1.real, -x1.imag)     # its complex conjugate
r = -x1.real - x2.real              # trial factor x^2 + r*x + s with roots x1, x2
s = (x1 * x2).real

Unfortunately, this does not really speed up the convergence.

Ecir Hana
1 Answer


As I've worked on the article and supplied the pictures, I can tell you that you really do not need particularly good approximations.

The most important initial step is to reduce the polynomial to even degree, as described in the article. After that you can hardly go wrong; there should be almost global convergence. To be safe, the same rule as in Newton's method applies: if there is no noticeable sign of convergence after 10 steps, restart from a different initial point.
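
If "reduce to even degree" is read as splitting off one real root first (every real polynomial of odd degree has at least one), a minimal sketch of that step could look like the following. This is only an illustration, not the article's code, and it assumes coefficients a[0..n] in descending order with a[0] the leading coefficient:

def eval_poly(a, x):                 # Horner evaluation, descending coefficients
    v = 0.0
    for c in a:
        v = v * x + c
    return v

def reduce_to_even_degree(a, tol=1e-12):
    if (len(a) - 1) % 2 == 0:        # degree already even: nothing to do
        return a, None
    R = 1 + max(abs(c / a[0]) for c in a[1:])   # Cauchy bound, all roots in |x| <= R
    lo, hi = -R, R                   # odd degree => sign change over [-R, R]
    if eval_poly(a, lo) * eval_poly(a, hi) > 0:
        raise RuntimeError("no sign change on [-R, R]")
    for _ in range(200):             # plain bisection for one real root
        mid = 0.5 * (lo + hi)
        if eval_poly(a, lo) * eval_poly(a, mid) <= 0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    root = 0.5 * (lo + hi)
    b = [a[0]]                       # deflate by (x - root) via synthetic division
    for c in a[1:-1]:
        b.append(c + root * b[-1])   # quotient coefficients; remainder a[-1] + root*b[-1] ~ 0
    return b, root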

It is of course sensible to compute some outer root radius and to choose the initial quadratic factor so that its roots lie inside this radius.


See the source code of http://catc.ac.ir/mazlumi/jscodes/bairstow.php for a "naive" or "vanilla", but seemingly robust, JavaScript implementation: no reduction to even degree, no care for coefficient/root magnitudes, ...


The example is effectively an odd-degree polynomial inside the unit disk, with one root -117.9917 at virtual infinity. For the initialization one should in every step compute the outer root radius, which is R=119 in the "1 + max of abs of coefficients" (leading coefficient 1) version. Then initialize with x^2-R^2, or with phi=2*pi*random() and x^2+R^2*cos(phi)*x+R^2*sin(phi), or something similar.
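
Read in the (r, s) parametrization from the question, where the trial factor is x^2 + r*x + s, these suggestions would amount to something like the sketch below (the coefficient list is the first example polynomial; the variable names are only illustrative):

from math import pi, cos, sin
from random import random

a = [1, 118, 1, 2, -2, -3, 3, 2, 1]           # first example polynomial
R = 1 + max(abs(c / a[0]) for c in a[1:])     # outer root radius, R = 119 here

# option 1: x^2 - R^2, read off as r = 0, s = -R^2
r1, s1 = 0.0, -R * R

# option 2: x^2 + R^2*cos(phi)*x + R^2*sin(phi) with a random angle phi
phi = 2 * pi * random()
r2, s2 = R * R * cos(phi), R * R * sin(phi)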

Lutz Lehmann
  • The problem is I can clearly see that there is a huge difference in convergence speed when I use even slightly different initial approximations. I mean, surely some must be better than others? And when there are roots with multiplicity greater than one, it's even worse. Even for even-degree polynomials... – Ecir Hana Aug 25 '15 at 11:30
  • The first should not happen, at least not very often. Multiple roots will always be a problem. Of the usual algorithms, only Jenkins-Traub is relatively robust against that, it only suffers from cancellation errors in small divisors. Can you document in the question one - if possible small - instance for the first problem? – Lutz Lehmann Aug 25 '15 at 11:39
  • See the edit. Btw., when I plug the example into the "vanilla" implementation you linked to, it freezes. – Ecir Hana Aug 25 '15 at 11:52
  • But only every third or fourth time, indicating heavy reliance on initial conditions, which are random in the script. – Lutz Lehmann Aug 25 '15 at 12:03
  • Sorry, I didn't understand the last sentence: the second part is a random point on a circle with radius R, but what is the first part? I mean, I need two numbers for the quadratic factor guess, so what is the `x` in `x^2-R^2`? – Ecir Hana Aug 25 '15 at 12:36
  • Also, if you could expand a bit on "effectively an odd-degree polynomial inside the unit disk with one root -117.9917 at virtual infinity", that would be awesome. Because it clearly is an 8th-degree polynomial, i.e. even degree. – Ecir Hana Aug 25 '15 at 12:41
  • The polynomial has one large coefficient. In the original scale, it is close to `x^8+118*x^7`. This has one root `x=-118` and seven roots `x=0`. Under small perturbations this does not change much. For the small roots the leading term `x^8` is a small perturbation; the dominant binomial is the degree 7 polynomial `118*x^7+1`. Thus for quadratic factors with small roots this is essentially the odd-degree situation: well-behaved if converging to complex solutions, diverging or slowly converging in the case of the real roots. (A quick numerical check of this is sketched after these comments.) – Lutz Lehmann Aug 25 '15 at 13:03
  • Wow, so interesting! Thanks for the explanation! And what is the `x` in `x^2-R^2`? Presumably you meant a root? But I don't know any roots; did you mean some random x, split into `p`, `q` so that `z^2+pz+q=x`? I apologize, I still don't quite understand how to find the initial factor. Lastly, not that it is a problem, just asking: if I start with R as an upper bound, would I most probably deflate the polynomial in decreasing order (from largest to smallest roots)? – Ecir Hana Aug 25 '15 at 13:13
  • Yes, these were possibilities for a sensibly scaled initial quadratic factor to obtain a pair of the larger roots. However, for numerical stability one tries first for the smaller roots using some inner root radius. As you have seen in your examples, this inner radius should be more carefully estimated to avoid long iterations. – Lutz Lehmann Aug 25 '15 at 16:34
  • Please see the edit, is this what you were suggesting? – Ecir Hana Aug 26 '15 at 11:00
  • Start with `R=0` and after the loop add `R=1+R` to get the correct bound. (For the initialization, one does not need the correct bound; this estimate is in most cases much too large.) Perhaps it is even sufficient to always initialize to `z²+R²`, i.e., `[1,0,R*R]`, without randomization. – Lutz Lehmann Aug 26 '15 at 11:09
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/87983/discussion-between-ecir-hana-and-lutzl). – Ecir Hana Aug 26 '15 at 11:28
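
For illustration, a quick numerical check of the "dominant binomial" point made in the comments above (numpy.roots is used here only as an independent check; it is not part of Bairstow's method):

import numpy as np

p = [1, 118, 1, 2, -2, -3, 3, 2, 1]           # the first example polynomial
print(np.roots(p))                            # one root near -117.99, the other seven much smaller

print(np.roots([118, 0, 0, 0, 0, 0, 0, 1]))   # 118*x^7 + 1: seven roots of magnitude (1/118)**(1/7) ~ 0.5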