Graeffe's method

From HandWiki
Revision as of 15:39, 6 February 2024 by Sherlock
Short description: Algorithm for finding polynomial roots

In mathematics, Graeffe's method or the Dandelin–Lobachevsky–Graeffe method is an algorithm for finding all of the roots of a polynomial. It was developed independently by Germinal Pierre Dandelin in 1826 and Lobachevsky in 1834. In 1837 Karl Heinrich Gräffe also discovered the principal idea of the method.[1] The method separates the roots of a polynomial by squaring them repeatedly. This squaring of the roots is done implicitly, that is, by working only on the coefficients of the polynomial. Finally, Viète's formulas are used to approximate the roots.

Dandelin–Graeffe iteration

Let p(x) be a polynomial of degree n

[math]\displaystyle{ p(x) = (x-x_1)\cdots(x-x_n). }[/math]

Then

[math]\displaystyle{ p(-x) = (-1)^n (x+x_1)\cdots(x+x_n). }[/math]

Let q(x) be the polynomial which has the squares [math]\displaystyle{ x_1^2, \cdots, x_n^2 }[/math] as its roots,

[math]\displaystyle{ q(x)= \left (x-x_1^2 \right )\cdots \left (x-x_n^2 \right ). }[/math]

Then we can write:

[math]\displaystyle{ \begin{align} q(x^2) & = \left (x^2-x_1^2 \right )\cdots \left (x^2-x_n^2 \right ) \\ & = (x-x_1)(x+x_1) \cdots (x-x_n) (x+x_n) \\ & = \left \{(x - x_1) \cdots (x - x_n) \right \} \times \left \{(x + x_1) \cdots (x + x_n) \right \} \\ & = p(x) \times \left \{(-1)^n (-x - x_1) \cdots (-x - x_n) \right \} \\ & = p(x) \times \left \{(-1)^n p(-x) \right \} \\ & = (-1)^n p(x) p(-x) \end{align} }[/math]
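
As a small worked instance of this identity (the polynomial is chosen here purely for illustration), take [math]\displaystyle{ p(x)=(x-1)(x-2)=x^2-3x+2 }[/math] with n = 2. Then

[math]\displaystyle{ (-1)^2\,p(x)\,p(-x) = \left(x^2-3x+2\right)\left(x^2+3x+2\right) = \left(x^2+2\right)^2-9x^2 = x^4-5x^2+4, }[/math]

so that [math]\displaystyle{ q(x)=x^2-5x+4=(x-1)(x-4) }[/math], whose roots are the squares 1 and 4 of the roots of p(x).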

q(x) can now be computed by algebraic operations on the coefficients of the polynomial p(x) alone. Let:

[math]\displaystyle{ \begin{align} p(x) &= x^n+a_1x^{n-1}+\cdots+a_{n-1}x+a_n \\ q(x) &= x^n+b_1x^{n-1}+\cdots+b_{n-1}x+b_n \end{align} }[/math]

then the coefficients are related by

[math]\displaystyle{ b_k=(-1)^k a_k^2 + 2\sum_{j=0}^{k-1}(-1)^j\,a_ja_{2k-j}, \qquad a_0=b_0=1, \quad a_m=0 \text{ for } m\gt n. }[/math]
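
The coefficient relation can be sketched directly in Python. This is a minimal illustration, not a reference implementation: the function name `graeffe_step` and the list layout `[a_0, a_1, ..., a_n]` (monic, descending powers) are assumptions made here, and coefficients with index above n are treated as zero.

```python
def graeffe_step(a):
    """One root-squaring step.

    a = [a_0, a_1, ..., a_n] with a_0 = 1 (hypothetical layout:
    coefficients of x^n, x^(n-1), ..., x^0). Returns [b_0, ..., b_n].
    """
    n = len(a) - 1

    def coeff(i):
        # a_i is taken to be 0 for i > n
        return a[i] if i <= n else 0

    b = []
    for k in range(n + 1):
        # b_k = (-1)^k a_k^2 + 2 * sum_{j<k} (-1)^j a_j a_{2k-j}
        value = (-1) ** k * a[k] ** 2
        value += 2 * sum((-1) ** j * a[j] * coeff(2 * k - j) for j in range(k))
        b.append(value)
    return b


# (x-1)(x-2)(x-3) = x^3 - 6x^2 + 11x - 6
print(graeffe_step([1, -6, 11, -6]))
```

For the cubic with roots 1, 2, 3 the result is the coefficient list of (x-1)(x-4)(x-9), that is `[1, -14, 49, -36]`.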

Graeffe observed that if one separates p(x) into its odd and even parts:

[math]\displaystyle{ p(x)=p_e \left (x^2 \right )+x p_o\left (x^2 \right ), }[/math]

then one obtains a simplified algebraic expression for q(x):

[math]\displaystyle{ q(x)=(-1)^n \left (p_e(x)^2-x p_o(x)^2 \right ). }[/math]

This expression involves the squaring of two polynomials of only half the degree, and is therefore used in most implementations of the method.
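
The even/odd splitting can be sketched as follows. The helper names `poly_mul` and `graeffe_step_split` are hypothetical, and, as an assumption of this sketch, coefficients are stored in ascending order (`c[i]` multiplies x^i) so that the even and odd parts are simple slices.

```python
def poly_mul(u, v):
    """Schoolbook product of two ascending coefficient lists."""
    out = [0] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            out[i + j] += ui * vj
    return out


def graeffe_step_split(c):
    """q(x) = (-1)^n (p_e(x)^2 - x p_o(x)^2), ascending coefficients."""
    n = len(c) - 1
    p_even, p_odd = c[0::2], c[1::2]       # p(x) = p_e(x^2) + x p_o(x^2)
    even_sq = poly_mul(p_even, p_even)     # p_e(x)^2
    odd_sq = [0] + poly_mul(p_odd, p_odd)  # x * p_o(x)^2
    sign = -1 if n % 2 else 1
    q = [0] * (n + 1)
    for i, v in enumerate(even_sq):
        q[i] += sign * v
    for i, v in enumerate(odd_sq):
        q[i] -= sign * v
    return q


# x^3 - 6x^2 + 11x - 6 in ascending order
print(graeffe_step_split([-6, 11, -6, 1]))
```

The two squared polynomials have roughly half the degree of p, which is where the saving comes from; this sketch uses schoolbook multiplication only to stay short.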

Iterating this procedure several times separates the roots with respect to their magnitudes. Repeating k times gives a polynomial of degree n (the superscript k on q and on its coefficients is an iteration index, not a power):

[math]\displaystyle{ q^k(y) = y^n + {a^k}_1\,y^{n-1} + \cdots + {a^k}_{n-1}\,y + {a^k}_n \, }[/math]

with roots

[math]\displaystyle{ y_1=x_1^{2^k},\,y_2=x_2^{2^k},\,\dots,\,y_n=x_n^{2^k}. }[/math]

If the magnitudes of the roots of the original polynomial were separated by some factor [math]\displaystyle{ \rho\gt 1 }[/math], that is, [math]\displaystyle{ |x_m|\ge\rho |x_{m+1}| }[/math], then the roots of the k-th iterate are separated by the fast-growing factor

[math]\displaystyle{ \rho^{2^k}\ge 1+2^k(\rho-1) }[/math].
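
To illustrate with numbers chosen only as an example: the roots 2 and 1 are separated by [math]\displaystyle{ \rho=2 }[/math]. After k = 3 iterations the corresponding roots are [math]\displaystyle{ 2^{2^3}=256 }[/math] and [math]\displaystyle{ 1^{2^3}=1 }[/math], so that

[math]\displaystyle{ \rho^{2^3}=256\ \ge\ 1+2^3(2-1)=9. }[/math]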

Classical Graeffe's method

Next, Viète's formulas are applied to the coefficients of the k-th iterate:

[math]\displaystyle{ \begin{align} a^k_{\;1} &= -(y_1+y_2+\cdots+y_n)\\ a^k_{\;2} &= y_1 y_2 + y_1 y_3+\cdots+y_{n-1} y_n\\ &\;\vdots\\ a^k_{\;n} &= (-1)^n(y_1 y_2 \cdots y_n). \end{align} }[/math]

If the roots [math]\displaystyle{ x_1,\dots,x_n }[/math] are sufficiently separated, say by a factor [math]\displaystyle{ \rho\gt 1 }[/math], [math]\displaystyle{ |x_m|\ge \rho|x_{m+1}| }[/math], then the iterated powers [math]\displaystyle{ y_1,y_2,\dots,y_n }[/math] of the roots are separated by the factor [math]\displaystyle{ \rho^{2^k} }[/math], which quickly becomes very large.

The coefficients of the iterated polynomial can then be approximated by their leading term,

[math]\displaystyle{ a^k_{\;1} \approx -y_1 }[/math]
[math]\displaystyle{ a^k_{\;2} \approx y_1 y_2 }[/math] and so on,

implying

[math]\displaystyle{ y_1\approx -a^k_{\;1},\; y_2\approx -a^k_{\;2}/a^k_{\;1}, \;\dots\; y_n\approx -a^k_{\;n}/a^k_{\;n-1}. }[/math]

Finally, logarithms are used to find the absolute values of the roots of the original polynomial, since [math]\displaystyle{ \log|x_m| = 2^{-k}\log|y_m| }[/math]. These magnitudes alone are already useful for generating meaningful starting points for other root-finding methods.
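
The magnitude estimate can be sketched in a few lines of Python. The names `graeffe_step` and `root_magnitudes` are hypothetical, and the sketch assumes integer input coefficients (so that the iterates stay exact), roots of distinct magnitude, and no vanishing coefficient in the final iterate.

```python
import math


def graeffe_step(a):
    """One root-squaring step on a = [1, a_1, ..., a_n] (descending powers)."""
    n = len(a) - 1
    b = []
    for k in range(n + 1):
        value = (-1) ** k * a[k] ** 2
        for j in range(k):
            if 2 * k - j <= n:  # a_m = 0 for m > n
                value += 2 * (-1) ** j * a[j] * a[2 * k - j]
        b.append(value)
    return b


def root_magnitudes(a, k):
    """Estimate |x_1| >= ... >= |x_n| from the k-th Graeffe iterate."""
    for _ in range(k):
        a = graeffe_step(a)
    # |y_m| ~ |a_m / a_{m-1}|, hence log|x_m| ~ (log|a_m| - log|a_{m-1}|) / 2^k
    logs = [math.log(abs(c)) for c in a]  # fails if a coefficient is zero
    return [math.exp((logs[m] - logs[m - 1]) / 2 ** k) for m in range(1, len(a))]


# roots 3, 2, 1
print(root_magnitudes([1, -6, 11, -6], 6))
```

Working with logarithms of exact integers avoids floating-point overflow here; with floating-point coefficients the growth of the iterates has to be handled separately.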

To also obtain the angle of these roots, a multitude of methods has been proposed. The simplest is to successively compute the square root of a (possibly complex) root of [math]\displaystyle{ q^m(y) }[/math], with m ranging from k down to 1, and to test which of the two sign variants is a root of [math]\displaystyle{ q^{m-1}(x) }[/math]. Before continuing to the roots of [math]\displaystyle{ q^{m-2}(x) }[/math], it might be necessary to numerically improve the accuracy of the root approximations for [math]\displaystyle{ q^{m-1}(x) }[/math], for instance by Newton's method.
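
The square-root descent can be sketched as follows. This is an illustration under stated assumptions rather than a robust solver: the names `graeffe_step`, `horner` and `recover_roots` are hypothetical, the input has integer coefficients and roots of distinct magnitude, and the optional Newton refinement between levels is omitted.

```python
import cmath


def graeffe_step(a):
    """One root-squaring step on a = [1, a_1, ..., a_n] (descending powers)."""
    n = len(a) - 1
    b = []
    for k in range(n + 1):
        value = (-1) ** k * a[k] ** 2
        for j in range(k):
            if 2 * k - j <= n:
                value += 2 * (-1) ** j * a[j] * a[2 * k - j]
        b.append(value)
    return b


def horner(a, x):
    """Evaluate the polynomial with descending coefficients a at x."""
    value = 0
    for c in a:
        value = value * x + c
    return value


def recover_roots(a, k):
    """Roots of p via k Graeffe steps followed by sign-tested square roots."""
    iterates = [a]
    for _ in range(k):
        iterates.append(graeffe_step(iterates[-1]))
    top = iterates[-1]
    roots = []
    for m in range(1, len(a)):
        y = complex(-top[m] / top[m - 1])   # y_m ~ -a_m / a_{m-1}
        for level in range(k, 0, -1):
            r = cmath.sqrt(y)
            lower = iterates[level - 1]
            # keep the sign variant that is (nearly) a root one level down
            y = r if abs(horner(lower, r)) <= abs(horner(lower, -r)) else -r
        roots.append(y)
    return roots


# (x-3)(x+2)(x-1) = x^3 - 2x^2 - 5x + 6
print(recover_roots([1, -2, -5, 6], 6))
```

In this example the sign test at the last level is what distinguishes the root -2 from +2.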

Graeffe's method works best for polynomials with simple real roots, though it can be adapted for polynomials with complex roots and coefficients, and roots with higher multiplicity. For instance, it has been observed[2] that for a root [math]\displaystyle{ x_{\ell+1}=x_{\ell+2}=\dots=x_{\ell+d} }[/math] with multiplicity d, the fractions

[math]\displaystyle{ \left|\frac{(a^{m-1}_{\;\ell+i})^2}{a^{m}_{\;\ell+i}}\right| }[/math] tend to [math]\displaystyle{ \binom{d}{i} }[/math]

for [math]\displaystyle{ i=0,1,\dots,d }[/math]. This allows one to estimate the multiplicity structure of the set of roots.
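
A minimal example, chosen here only for illustration, is the double root of [math]\displaystyle{ p(x)=(x-1)^2=x^2-2x+1 }[/math], so that [math]\displaystyle{ \ell=0 }[/math] and d = 2. Since squaring the root 1 leaves it unchanged, every iterate has the coefficients 1, −2, 1, and the fractions are

[math]\displaystyle{ \left|\frac{1^2}{1}\right|=1=\binom{2}{0},\qquad \left|\frac{(-2)^2}{-2}\right|=2=\binom{2}{1},\qquad \left|\frac{1^2}{1}\right|=1=\binom{2}{2}. }[/math]

In this special case the values are attained exactly rather than only in the limit.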

From a numerical point of view, this method is problematic because the coefficients of the iterated polynomials very quickly span many orders of magnitude, which leads to serious numerical errors. A second, minor concern is that many different polynomials lead to the same Graeffe iterates.

Tangential Graeffe method

This method replaces the numbers by truncated power series of degree 1, also known as dual numbers. Symbolically, this is achieved by introducing an "algebraic infinitesimal" [math]\displaystyle{ \varepsilon }[/math] with the defining property [math]\displaystyle{ \varepsilon^2=0 }[/math]. Then the polynomial [math]\displaystyle{ p(x+\varepsilon)=p(x)+\varepsilon\,p'(x) }[/math] has roots [math]\displaystyle{ x_m-\varepsilon }[/math], with powers

[math]\displaystyle{ (x_m-\varepsilon)^{2^k}=x_m^{2^k}-\varepsilon\,{2^k}\,x_m^{2^k-1}=y_m+\varepsilon\,\dot y_m. }[/math]

Thus the value of [math]\displaystyle{ x_m }[/math] is easily obtained as the fraction [math]\displaystyle{ x_m=-\tfrac{2^k\,y_m}{\dot y_m}. }[/math]

This kind of computation with infinitesimals is easy to implement, analogously to computation with complex numbers. If one assumes complex coordinates or an initial shift by some randomly chosen complex number, then all roots of the polynomial will be distinct and consequently recoverable with the iteration.
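
A sketch of the tangential iteration with dual numbers stored as pairs `(value, eps_part)` is given below. The names `dual_mul`, `graeffe_step_dual` and `tangential_graeffe` are hypothetical. The sketch assumes integer coefficients, roots of distinct magnitude, and nonzero coefficients in the final iterate. It uses the approximation of the dual root by the ratio of consecutive dual coefficients, whose infinitesimal part gives [math]\displaystyle{ \dot y_m/y_m = \dot a_m/a_m - \dot a_{m-1}/a_{m-1} }[/math].

```python
from fractions import Fraction


def dual_mul(u, v):
    """(u0 + eps*u1)(v0 + eps*v1) with eps^2 = 0."""
    return (u[0] * v[0], u[0] * v[1] + u[1] * v[0])


def graeffe_step_dual(a):
    """Root-squaring step on dual-number coefficients (descending powers)."""
    n = len(a) - 1
    b = []
    for k in range(n + 1):
        sq = dual_mul(a[k], a[k])
        sign = -1 if k % 2 else 1
        val, eps = sign * sq[0], sign * sq[1]
        for j in range(k):
            if 2 * k - j <= n:
                t = dual_mul(a[j], a[2 * k - j])
                s = -2 if j % 2 else 2
                val, eps = val + s * t[0], eps + s * t[1]
        b.append((val, eps))
    return b


def tangential_graeffe(a, k):
    """Approximate roots of the monic integer polynomial a, largest first."""
    n = len(a) - 1
    # coefficients of p(x + eps) = p(x) + eps * p'(x)
    d = [(a[0], 0)] + [(a[i], (n - i + 1) * a[i - 1]) for i in range(1, n + 1)]
    for _ in range(k):
        d = graeffe_step_dual(d)
    roots = []
    for m in range(1, n + 1):
        (a_m, da_m), (a_prev, da_prev) = d[m], d[m - 1]
        # ydot_m / y_m, so that x_m = -2^k * y_m / ydot_m
        ratio = Fraction(da_m, a_m) - Fraction(da_prev, a_prev)
        roots.append(float(-(2 ** k) / ratio))
    return roots


# (x-3)(x+2)(x-1) = x^3 - 2x^2 - 5x + 6
print(tangential_graeffe([1, -2, -5, 6], 6))
```

Unlike the magnitude-only estimate, the signs of the real roots come out directly, without a separate square-root descent.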

Renormalization

Every polynomial can be scaled in domain and range such that in the resulting polynomial the first and the last coefficient have size one. If the size of the inner coefficients is bounded by M, then the size of the inner coefficients after one stage of the Graeffe iteration is bounded by [math]\displaystyle{ nM^2 }[/math]. After k stages one gets the bound [math]\displaystyle{ n^{2^k-1}M^{2^k} }[/math] for the inner coefficients.

To overcome the limit posed by the growth of the powers, Malajovich–Zubelli propose to represent coefficients and intermediate results in the kth stage of the algorithm by a scaled polar form

[math]\displaystyle{ c=\alpha\,e^{-2^k\,r}, }[/math]

where [math]\displaystyle{ \alpha=\frac{c}{|c|} }[/math] is a complex number of unit length and [math]\displaystyle{ r=-2^{-k}\log|c| }[/math] is a real number (positive whenever [math]\displaystyle{ |c|\lt 1 }[/math]). Splitting off the power [math]\displaystyle{ 2^k }[/math] in the exponent reduces the absolute value of c to the corresponding dyadic root. Since this preserves the magnitude of the (representation of the) initial coefficients, this process was named renormalization.

Multiplication of two numbers of this type is straightforward, whereas addition is performed following the factorization [math]\displaystyle{ c_3=c_1+c_2=|c_1|\cdot\left(\alpha_1+\alpha_2\tfrac{|c_2|}{|c_1|}\right) }[/math], where [math]\displaystyle{ c_1 }[/math] is chosen as the larger of both numbers, that is, [math]\displaystyle{ r_1\lt r_2 }[/math]. Thus

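[math]\displaystyle{ \alpha_3=\tfrac{s}{|s|} }[/math] and [math]\displaystyle{ r_3=r_1-2^{-k}\,\log{|s|} }[/math] with [math]\displaystyle{ s=\alpha_1+\alpha_2\,e^{2^k(r_1-r_2)}. }[/math] The sign in [math]\displaystyle{ r_3 }[/math] follows from [math]\displaystyle{ |c_3|=|c_1|\,|s| }[/math] together with [math]\displaystyle{ |c|=e^{-2^k r} }[/math].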
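
The arithmetic in scaled polar form can be sketched as follows. The function names are hypothetical, and the update of r in the addition is derived here from [math]\displaystyle{ |c_3|=|c_1|\,|s| }[/math] under the convention [math]\displaystyle{ c=\alpha\,e^{-2^k r} }[/math] stated above; it is a sketch of the idea, not the authors' implementation.

```python
import math


def to_renorm(c, k):
    """Represent nonzero c as (alpha, r) with c = alpha * exp(-2^k * r)."""
    return (c / abs(c), -math.log(abs(c)) / 2 ** k)


def from_renorm(x, k):
    """Convert (alpha, r) back to an ordinary complex number."""
    alpha, r = x
    return alpha * math.exp(-(2 ** k) * r)


def renorm_mul(x, y):
    """Product: unit factors multiply, scaled logarithms add."""
    return (x[0] * y[0], x[1] + y[1])


def renorm_add(x, y, k):
    """Sum, factoring out the operand of larger magnitude (smaller r)."""
    if x[1] > y[1]:
        x, y = y, x
    (alpha1, r1), (alpha2, r2) = x, y
    # the exponent is <= 0, so exp cannot overflow
    s = alpha1 + alpha2 * math.exp(2 ** k * (r1 - r2))
    if s == 0:
        raise ZeroDivisionError("exact cancellation has no polar form")
    return (s / abs(s), r1 - math.log(abs(s)) / 2 ** k)


k = 3
c1, c2 = 3 + 4j, -1 + 2j
x1, x2 = to_renorm(c1, k), to_renorm(c2, k)
print(from_renorm(renorm_add(x1, x2, k), k), c1 + c2)
print(from_renorm(renorm_mul(x1, x2), k), c1 * c2)
```

The point of the representation is that only the bounded quantities α and r are ever stored; `from_renorm` is included for checking small cases and would overflow for the huge coefficients the method is designed for.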

The coefficients [math]\displaystyle{ a_0,a_1,\dots,a_n }[/math] of the final stage k of the Graeffe iteration, for some reasonably large value of k, are represented by pairs [math]\displaystyle{ (\alpha_m,r_m) }[/math], [math]\displaystyle{ m=0,\dots,n }[/math]. By identifying the corners of the convex envelope of the point set [math]\displaystyle{ \{(m,r_m):\;m=0,\dots,n\} }[/math] one can determine the multiplicities of the roots of the polynomial. Combining this renormalization with the tangent iteration one can extract directly from the coefficients at the corners of the envelope the roots of the original polynomial.

References

  1. Householder, Alston Scott (1959). "Dandelin, Lobačevskiǐ, or Graeffe". The American Mathematical Monthly 66 (6): 464–466. doi:10.2307/2310626. 
  2. Best, G.C. (1949). "Notes on the Graeffe Method of Root Squaring". The American Mathematical Monthly 56 (2): 91–94. doi:10.2307/2306166.