Budan's theorem

Short description: Upper bound and parity of the number of real roots of a polynomial in an interval

In mathematics, Budan's theorem is a theorem for bounding the number of real roots of a polynomial in an interval, and computing the parity of this number. It was published in 1807 by François Budan de Boislaurent.

A similar theorem was published independently by Joseph Fourier in 1820. Each of these theorems is a corollary of the other. Fourier's statement appears more often in the literature of the 19th century and has been referred to as Fourier's, Budan–Fourier, Fourier–Budan, and even Budan's theorem.

Budan's original formulation is used in fast modern algorithms for real-root isolation of polynomials.

Sign variation

Let [math]\displaystyle{ c_0, c_1, c_2, \ldots, c_n }[/math] be a finite sequence of real numbers. A sign variation or sign change in the sequence is a pair of indices i < j such that [math]\displaystyle{ c_ic_j \lt 0, }[/math] and either j = i + 1 or [math]\displaystyle{ c_k = 0 }[/math] for all k such that i < k < j.

In other words, a sign variation occurs in the sequence at each place where the signs change, when ignoring zeros.
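As a minimal sketch of this definition, the following Python function counts sign variations by discarding zeros and comparing the signs of neighbouring entries. The name `sign_variations` is chosen here for illustration and does not come from the article.

```python
def sign_variations(seq):
    """Count the sign changes in a sequence of real numbers, ignoring zeros."""
    # Keep only the signs of the nonzero entries (True = positive, False = negative).
    signs = [x > 0 for x in seq if x != 0]
    # A variation occurs wherever two consecutive remaining signs differ.
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


print(sign_variations([1, 0, -7, 7]))  # prints 2
```

The zero between 1 and -7 is skipped, so the sequence is read as +, -, +, which has two variations.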

For studying the real roots of a polynomial, the number of sign variations of several sequences may be used. For Budan's theorem, it is the sequence of the coefficients. For Fourier's theorem, it is the sequence of values of the successive derivatives at a point. For Sturm's theorem, it is the sequence of values at a point of the Sturm sequence.

Descartes' rule of signs

All results described in this article are based on Descartes' rule of signs.

If p(x) is a univariate polynomial with real coefficients, let us denote by [math]\displaystyle{ \#_+(p) }[/math] the number of its positive real roots, counted with their multiplicity,[1] and by v(p) the number of sign variations in the sequence of its coefficients. Descartes' rule of signs asserts that

[math]\displaystyle{ v(p) - \#_+(p) }[/math] is a nonnegative even integer.

In particular, if v(p) ≤ 1, then one has [math]\displaystyle{ \#_+(p) = v(p). }[/math]
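The rule can be illustrated with a short Python sketch that lists the values of the number of positive roots compatible with v(p). The function names are hypothetical, and coefficients are assumed to be given from the highest degree to the lowest.

```python
def sign_variations(coeffs):
    """Number of sign changes in a coefficient sequence, ignoring zeros."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def possible_positive_root_counts(coeffs):
    """Counts allowed by Descartes' rule: v(p), v(p) - 2, ..., down to 0 or 1."""
    v = sign_variations(coeffs)
    return list(range(v, -1, -2))


# p(x) = x^3 - 7x + 7 has two sign variations.
print(possible_positive_root_counts([1, 0, -7, 7]))  # prints [2, 0]
# q(x) = x^3 - 1 has one sign variation, hence exactly one positive root.
print(possible_positive_root_counts([1, 0, 0, -1]))  # prints [1]
```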

Budan's statement

Given a univariate polynomial p(x) with real coefficients, let us denote by [math]\displaystyle{ \#_{(\ell,r]}(p) }[/math] the number of real roots, counted with their multiplicities,[1] of p in a half-open interval [math]\displaystyle{ (\ell, r] }[/math] (with [math]\displaystyle{ \ell \lt r }[/math] real numbers). Let us denote also by [math]\displaystyle{ v_h(p) }[/math] the number of sign variations in the sequence of the coefficients of the polynomial [math]\displaystyle{ p_h(x) = p(x + h). }[/math] In particular, one has [math]\displaystyle{ v(p) = v_0(p) }[/math] with the notation of the preceding section.

Budan's theorem is the following:

[math]\displaystyle{ v_\ell(p)-v_r(p)-\#_{(\ell,r]} }[/math] is a nonnegative even integer.

As [math]\displaystyle{ \#_{(\ell,r]} }[/math] is nonnegative, this implies [math]\displaystyle{ v_\ell(p)\ge v_r(p). }[/math]

This is a generalization of Descartes' rule of signs: if one chooses r sufficiently large, it is larger than all real roots of p, and all the coefficients of [math]\displaystyle{ p_r(x) }[/math] have the same sign (that of the leading coefficient of p), so that [math]\displaystyle{ v_r(p)=0. }[/math] Thus [math]\displaystyle{ v_0(p)= v_0(p)- v_r(p), }[/math] and [math]\displaystyle{ \#_+ = \#_{(0,r]}, }[/math] which makes Descartes' rule of signs a special case of Budan's theorem.

As for Descartes' rule of signs, if [math]\displaystyle{ v_\ell(p)-v_r(p)\le 1, }[/math] one has [math]\displaystyle{ \#_{(\ell,r]}=v_\ell(p)-v_r(p). }[/math] This provides a "zero-root test" when the difference is 0 and a "one-root test" when it is 1.
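The quantity in Budan's theorem can be computed as in the following Python sketch. It assumes coefficients listed from the highest degree to the lowest and integer or `Fraction` shift values, and it obtains the coefficients of p(x + h) by repeated synthetic division, which is one common way to do so; the names `taylor_shift` and `budan_bound` are illustrative only.

```python
from fractions import Fraction


def sign_variations(coeffs):
    """Number of sign changes in a coefficient sequence, ignoring zeros."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def taylor_shift(coeffs, h):
    """Coefficients of p(x + h), highest degree first, computed exactly."""
    out = [Fraction(c) for c in coeffs]
    n = len(out) - 1
    # Pass i is a synthetic division by (x - h); it fixes the coefficient of x^i.
    for i in range(n):
        for j in range(1, n + 1 - i):
            out[j] += h * out[j - 1]
    return out


def budan_bound(coeffs, left, right):
    """v_left(p) - v_right(p): bounds the number of roots in (left, right]."""
    return (sign_variations(taylor_shift(coeffs, left))
            - sign_variations(taylor_shift(coeffs, right)))


# p(x) = x^2 - 3x + 2 = (x - 1)(x - 2)
p = [1, -3, 2]
print(budan_bound(p, 0, 3))  # prints 2
print(budan_bound(p, 0, 1))  # prints 1
print(budan_bound(p, 1, 3))  # prints 1
```

In the second call the root 1 is counted because the interval (0, 1] is closed on the right; in the third call it is excluded because (1, 3] is open on the left.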

Examples

1. Given the polynomial [math]\displaystyle{ p(x)=x^3 -7x + 7, }[/math] and the open interval [math]\displaystyle{ (0,2) }[/math], one has

[math]\displaystyle{ \begin{align}p(x+0)&=p(x)=x^3 -7x + 7\\ p(x+2)&=(x+2)^3 -7(x+2) + 7=x^3+6x^2+5x+1 \end{align}. }[/math]

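Thus, [math]\displaystyle{ v_0(p)-v_2(p)= 2-0=2, }[/math] and Budan's theorem asserts that the polynomial [math]\displaystyle{ p(x) }[/math] has either two or zero real roots in the half-open interval [math]\displaystyle{ (0,2]. }[/math] As [math]\displaystyle{ p(2)=1 }[/math] is nonzero, the same holds for the open interval [math]\displaystyle{ (0,2). }[/math]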

2. With the same polynomial [math]\displaystyle{ p(x)=x^3 -7x + 7 }[/math] one has

[math]\displaystyle{ p(x+1)=(x+1)^3 -7(x+1) + 7=x^3+3x^2-4x+1. }[/math]

Thus, [math]\displaystyle{ v_0(p)-v_1(p)= 2-2=0, }[/math] and Budan's theorem asserts that the polynomial [math]\displaystyle{ p(x) }[/math] has no real root in the open interval [math]\displaystyle{ (0,1). }[/math] This is an example of the use of Budan's theorem as a zero-root test.
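The two examples can be checked with the Python sketch below, which also tries the midpoint 3/2 to show the one-root test. The choice of 3/2 and the function names are assumptions made here for illustration; exact rational arithmetic is used so that the signs are reliable.

```python
from fractions import Fraction


def variations_after_shift(coeffs, h):
    """v_h(p): sign variations of the coefficients of p(x + h).

    coeffs lists the coefficients of p from highest degree to lowest.
    """
    c = [Fraction(a) for a in coeffs]
    n = len(c) - 1
    for i in range(n):  # repeated Horner steps yield the coefficients of p(x + h)
        for j in range(1, n + 1 - i):
            c[j] += h * c[j - 1]
    signs = [a > 0 for a in c if a != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def root_bound(coeffs, left, right):
    """v_left(p) - v_right(p) for the half-open interval (left, right]."""
    return variations_after_shift(coeffs, left) - variations_after_shift(coeffs, right)


p = [1, 0, -7, 7]  # x^3 - 7x + 7
half = Fraction(3, 2)
for left, right in [(0, 1), (0, 2), (1, half), (half, 2)]:
    print(f"({left}, {right}]: bound {root_bound(p, left, right)}")
```

The bounds are 0, 2, 1 and 1 respectively. The last two values are one-root tests: under the theorem, each of the intervals (1, 3/2] and (3/2, 2] contains exactly one root, so the polynomial has two roots in (0, 2].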

Fourier's statement

Fourier's theorem on polynomial real roots, also called the Fourier–Budan theorem or Budan–Fourier theorem (sometimes just Budan's theorem), is exactly the same as Budan's theorem, except that, for h = ℓ and h = r, the sequence of the coefficients of p(x + h) is replaced by the sequence of the derivatives of p at h.

Each theorem is a corollary of the other. This results from the Taylor expansion

[math]\displaystyle{ p(x)=\sum_{i=0}^{\deg p} \frac {p^{(i)}(h)}{i!} (x-h)^i }[/math]

of the polynomial p at h, which implies that the coefficient of [math]\displaystyle{ x^i }[/math] in p(x + h) is the quotient of [math]\displaystyle{ p^{(i)}(h) }[/math] by i!, a positive number. Thus the sequences considered in Fourier's theorem and in Budan's theorem have the same number of sign variations.
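This relationship can be checked numerically. The following Python sketch, with hypothetical helper names, evaluates the successive derivatives of p(x) = x^3 - 7x + 7 at h = 1 and divides each by i! to recover the coefficients of p(x + 1) computed in the examples above.

```python
from fractions import Fraction
from math import factorial


def evaluate(coeffs, h):
    """Horner evaluation; coeffs run from highest degree to lowest."""
    acc = 0
    for c in coeffs:
        acc = acc * h + c
    return acc


def derivative(coeffs):
    """Coefficients of the derivative, highest degree first."""
    n = len(coeffs) - 1
    return [c * (n - k) for k, c in enumerate(coeffs[:-1])]


def fourier_sequence(coeffs, h):
    """The values p(h), p'(h), ..., p^(n)(h)."""
    seq = []
    while coeffs:
        seq.append(evaluate(coeffs, h))
        coeffs = derivative(coeffs)
    return seq


seq = fourier_sequence([1, 0, -7, 7], 1)
print(seq)  # prints [1, -4, 6, 6]

# Dividing by i! gives the coefficients of p(x + 1) in ascending order.
taylor = [Fraction(d, factorial(i)) for i, d in enumerate(seq)]
print(taylor == [1, -4, 3, 1])  # prints True
```

Since each i! is positive, both sequences show the sign pattern +, -, +, + and therefore have two sign variations.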

This strong relationship between the two theorems may explain the priority controversy that occurred in the 19th century, and the use of several names for the same theorem. In modern usage, for computer computation, Budan's theorem is generally preferred, since the numbers in the sequences of Fourier's theorem are much larger than those in Budan's, because of the factorial factor.

Proof

As each theorem is a corollary of the other, it suffices to prove Fourier's theorem.

Proof:

Let [math]\displaystyle{ f }[/math] be the polynomial under consideration and [math]\displaystyle{ n }[/math] its degree, so that [math]\displaystyle{ f, f', ..., f^{(n-1)} }[/math] are nonconstant polynomials, [math]\displaystyle{ f^{(n)} }[/math] is a nonzero constant, and [math]\displaystyle{ f^{(n+1)}, ... }[/math] are all identically zero.

As a function of [math]\displaystyle{ t, }[/math] the number [math]\displaystyle{ v_t(f) }[/math] of sign variations of the sequence [math]\displaystyle{ f(t), f'(t), ..., f^{(n)}(t) }[/math] can only vary at a root of at least one of [math]\displaystyle{ f, f', ..., f^{(n-1)}. }[/math]

If [math]\displaystyle{ v_t(f) }[/math] varies at [math]\displaystyle{ t=r }[/math], then for some [math]\displaystyle{ k }[/math], [math]\displaystyle{ f^{(k)} }[/math] has a root at [math]\displaystyle{ r }[/math], and none of [math]\displaystyle{ f, f', ..., f^{(k-1)} }[/math] has a root at [math]\displaystyle{ r }[/math].

If [math]\displaystyle{ k=0 }[/math], then [math]\displaystyle{ f(x) = (x-r)^{s}p(x-r) }[/math] for some [math]\displaystyle{ s \geq 1 }[/math] and some polynomial [math]\displaystyle{ p }[/math] that satisfies [math]\displaystyle{ p(0) \neq 0 }[/math]. By explicitly computing [math]\displaystyle{ f, f', ..., f^{(n)} }[/math] at [math]\displaystyle{ r }[/math] and [math]\displaystyle{ r-\epsilon }[/math] for a small [math]\displaystyle{ \epsilon \gt 0 }[/math], we have [math]\displaystyle{ v_r(f) = v_{r-\epsilon}(f) - s- 2s' }[/math] for some integer [math]\displaystyle{ s' \geq 0. }[/math]

In this equation, the term [math]\displaystyle{ -s }[/math] is due to the signs of [math]\displaystyle{ f, f', ..., f^{(s)} }[/math] changing from [math]\displaystyle{ (-1)^s\operatorname{sign}(p(0)), (-1)^{s-1}\operatorname{sign}(p(0)), ..., -\operatorname{sign}(p(0)), \operatorname{sign}(p(0)) }[/math] to [math]\displaystyle{ 0, 0, ..., 0, \operatorname{sign}(p(0)) }[/math]. The term [math]\displaystyle{ - 2s' }[/math] is due to the higher derivatives possibly vanishing at [math]\displaystyle{ r }[/math].

If [math]\displaystyle{ k \geq 1 }[/math], then some derivatives vanish at [math]\displaystyle{ r }[/math], but both [math]\displaystyle{ f^{(k-1)} }[/math] and [math]\displaystyle{ f^{(n)} }[/math] remain nonzero there, so only an even number of sign changes is lost:

[math]\displaystyle{ v_r(f) = v_{r-\epsilon}(f) - 2s' }[/math] for some integer [math]\displaystyle{ s' \geq 0. }[/math]

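If [math]\displaystyle{ v_t(f) }[/math] varies at [math]\displaystyle{ t=\ell }[/math], then arguing similarly, we find that in both cases we can take a small [math]\displaystyle{ \epsilon \gt 0 }[/math] such that [math]\displaystyle{ v_{\ell+\epsilon}(f) = v_\ell(f) }[/math]. The same argument applies to the right of any point, so [math]\displaystyle{ v_t(f) }[/math] changes only when [math]\displaystyle{ t }[/math] reaches a root of [math]\displaystyle{ f }[/math] or of one of its derivatives from the left. Summing these changes over the interval [math]\displaystyle{ (\ell, r] }[/math] shows that [math]\displaystyle{ v_\ell(f) - v_r(f) }[/math] equals the number of roots of [math]\displaystyle{ f }[/math] in [math]\displaystyle{ (\ell, r] }[/math], counted with multiplicity, plus a nonnegative even integer.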
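The two cases of the proof can be observed on small examples. The Python sketch below evaluates the number of sign variations of the derivative sequence just before, at, and just after a point; the two polynomials, the value of ε and the function name `v` are choices made here for illustration.

```python
from fractions import Fraction


def v(coeffs, t):
    """Sign variations of f(t), f'(t), ..., f^(n)(t).

    coeffs lists the coefficients of f from highest degree to lowest.
    """
    values = []
    while coeffs:
        acc = 0
        for c in coeffs:  # Horner evaluation at t
            acc = acc * t + c
        values.append(acc)
        n = len(coeffs) - 1
        coeffs = [c * (n - k) for k, c in enumerate(coeffs[:-1])]  # derivative
    signs = [x > 0 for x in values if x != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


eps = Fraction(1, 100)

# Case k = 0 with s = 2: f(x) = (x - 1)^2 has a double root at 1.
f = [1, -2, 1]
print(v(f, 1 - eps), v(f, 1), v(f, 1 + eps))  # prints 2 0 0

# Case k >= 1: g(x) = x^2 + 1 has no real root, but g' vanishes at 0.
g = [1, 0, 1]
print(v(g, -eps), v(g, 0), v(g, eps))  # prints 2 0 0
```

In the first case the count drops by s = 2 at the double root; in the second it drops by 2 although there is no root, which is the even excess allowed by the theorem. In both cases the count is unchanged immediately to the right of the point.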

History

The problem of counting and locating the real roots of a polynomial began to be studied systematically only at the beginning of the 19th century.

In 1807, François Budan de Boislaurent discovered a method for extending Descartes' rule of signs, which is valid for the interval (0, +∞), to any interval.[2]

Joseph Fourier published a similar theorem in 1820,[3] on which he had worked for more than twenty years.[4]

Because of the similarity between the two theorems, there was a priority controversy,[5][6] despite the fact that the two theorems were discovered independently.[4] It was generally Fourier's formulation and proof that were used, during the 19th century, in textbooks on the theory of equations.

Use in 19th century

Budan's and Fourier's theorems were soon considered to be of great importance, although they do not completely solve the problem of counting the number of real roots of a polynomial in an interval. This problem was completely solved in 1827 by Sturm.

Although Sturm's theorem is not based on Descartes' rule of signs, Sturm's and Fourier's theorems are related not only by the use of the number of sign variations of a sequence of numbers, but also by a similar approach to the problem. Sturm himself acknowledged having been inspired by Fourier's methods:[7] « C'est en m'appuyant sur les principes qu'il a posés, et en imitant ses démonstrations, que j'ai trouvé les nouveaux théorèmes que je vais énoncer. » which translates into « It is by relying upon the principles he has laid out and by imitating his proofs that I have found the new theorems which I am about to present. »

Because of this, during the 19th century, Fourier's and Sturm's theorems appeared together in almost all books on the theory of equations.

Fourier and Budan left open the problem of shrinking the intervals in which roots are searched until the difference between the numbers of sign variations is at most one, which certifies that each final interval contains at most one root. This problem was solved in 1834 by Alexandre Joseph Hidulph Vincent.[8] Roughly speaking, Vincent's theorem consists of using continued fractions for replacing Budan's linear transformations of the variable by Möbius transformations.

Budan's, Fourier's and Vincent's theorems sank into oblivion at the end of the 19th century. The last author to mention these theorems before the second half of the 20th century was Joseph Alfred Serret.[9] They were introduced again in 1976 by Collins and Akritas, who used them in computer algebra to provide an efficient algorithm for real-root isolation.[10]

References

  1. This means that a root of multiplicity m is counted as m roots.
  2. Budan, François D. (1807). Nouvelle méthode pour la résolution des équations numériques. Paris: Courcier. https://books.google.com/books?id=VyMOAAAAQAAJ. 
  3. Fourier, Jean Baptiste Joseph (1820). "Sur l'usage du théorème de Descartes dans la recherche des limites des racines". Bulletin des Sciences, par la Société Philomatique de Paris: 156–165. https://archive.org/details/bulletindesscien20soci. 
  4. Arago, François (1859), Biographies of distinguished scientific men, Boston: Ticknor and Fields (English Translation), p. 383, https://books.google.com/books?id=xGgSAAAAIAAJ 
  5. Akritas, Alkiviadis G. (1981). "On the Budan–Fourier Controversy". ACM SIGSAM Bulletin 15 (1): 8–10. doi:10.1145/1089242.1089243. 
  6. Akritas, Alkiviadis G. (1982). "Reflections on a Pair of Theorems by Budan and Fourier". Mathematics Magazine 55 (5): 292–298. doi:10.2307/2690097. 
  7. Benis-Sinaceur, Hourya (1988). "Deux moments dans l'histoire du Théorème d'algèbre de Ch. F. Sturm". Revue d'Histoire des Sciences 41 (2): 99–132 (p. 108). doi:10.3406/rhs.1988.4093. https://halshs.archives-ouvertes.fr/halshs-01119574/file/1988_RHS_vol41_n2_p99_deuxmomentsdans.pdf. 
  8. Vincent, Alexandre Joseph Hidulph (1834). "Mémoire sur la résolution des équations numériques". Mémoires de la Société Royale des Sciences, de l' Agriculture et des Arts, de Lille: 1–34. http://gallica.bnf.fr/ark:/12148/bpt6k57787134/f4.image.r=Agence%20Rol.langEN. 
  9. Serret, Joseph A. (1877). Cours d'algèbre supérieure. Tome I. Gauthier-Villars. pp. 363–368. https://archive.org/details/coursdalgbresu01serruoft. 
  10. Collins, G. E.; Akritas, A. G. (1976). "Polynomial real root isolation using Descarte's rule of signs". Proceedings of the 1976 ACM symposium on Symbolic and Algebraic Computation. pp. 272–275. doi:10.1145/800205.806346. https://doi.org/10.1145/800205.806346. 

External links

O'Connor, John J.; Robertson, Edmund F., "Budan de Boislaurent", MacTutor History of Mathematics archive, University of St Andrews, http://www-history.mcs.st-andrews.ac.uk/Biographies/{{{id}}}.html .