Berlekamp's root finding algorithm

Short description: Method in number theory

In number theory, Berlekamp's root finding algorithm, also called the Berlekamp–Rabin algorithm, is a probabilistic method of finding roots of polynomials over the field [math]\displaystyle{ \mathbb Z_p }[/math]. The method was discovered by Elwyn Berlekamp in 1970[1] as an auxiliary to the algorithm for polynomial factorization over finite fields. The algorithm was later modified by Rabin for arbitrary finite fields in 1979.[2] The method was also independently discovered before Berlekamp by other researchers.[3]

History

The method was proposed by Elwyn Berlekamp in his 1970 work[1] on polynomial factorization over finite fields. His original work lacked a formal correctness proof[2] and was later refined and modified for arbitrary finite fields by Michael Rabin.[2] In 1986 René Peralta proposed a similar algorithm[4] for finding square roots in [math]\displaystyle{ \mathbb Z_p }[/math].[5] In 2000 Peralta's method was generalized for cubic equations.[6]

Statement of problem

Let [math]\displaystyle{ p }[/math] be an odd prime number. Consider the polynomial [math]\displaystyle{ f(x) = a_0 + a_1 x + \cdots + a_n x^n }[/math] over the field [math]\displaystyle{ \mathbb Z_p }[/math] of remainders modulo [math]\displaystyle{ p }[/math]. The algorithm should find all [math]\displaystyle{ \lambda }[/math] in [math]\displaystyle{ \mathbb Z_p }[/math] such that [math]\displaystyle{ f(\lambda)= 0 }[/math] in [math]\displaystyle{ \mathbb Z_p }[/math].[2][7]

Algorithm

Randomization

Let [math]\displaystyle{ f(x) = (x-\lambda_1)(x-\lambda_2)\cdots(x-\lambda_n) }[/math]. Finding all roots of this polynomial is equivalent to finding its factorization into linear factors. To find such a factorization it is sufficient to split the polynomial into any two non-trivial divisors and factorize them recursively. To do this, consider the polynomial [math]\displaystyle{ f_z(x)=f(x-z) = (x-\lambda_1 - z)(x-\lambda_2 - z) \cdots (x-\lambda_n-z) }[/math] where [math]\displaystyle{ z }[/math] is any element of [math]\displaystyle{ \mathbb Z_p }[/math]. If one can represent this polynomial as the product [math]\displaystyle{ f_z(x)=p_0(x)p_1(x) }[/math], then in terms of the initial polynomial it means that [math]\displaystyle{ f(x) =p_0(x+z)p_1(x+z) }[/math], which provides the needed factorization of [math]\displaystyle{ f(x) }[/math].[1][7]
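The substitution [math]\displaystyle{ f(x) \mapsto f(x-z) }[/math] can be sketched in Python as follows. This is only an illustrative sketch: the function name shift_poly and the convention that coefficient lists store the lowest degree first are assumptions made here, not part of the original description.

```python
def shift_poly(coeffs, z, p):
    """Return the coefficients of f(x - z) over Z_p.

    coeffs[i] is the coefficient of x^i (lowest degree first).
    """
    result = []
    # Horner's rule in the variable (x - z): result = result * (x - z) + a
    for a in reversed(coeffs):
        nxt = [0] * (len(result) + 1)
        for i, c in enumerate(result):
            nxt[i + 1] = (nxt[i + 1] + c) % p   # contribution of c * x
            nxt[i] = (nxt[i] - c * z) % p       # contribution of c * (-z)
        nxt[0] = (nxt[0] + a) % p
        result = nxt
    # Drop zero leading coefficients, if any.
    while len(result) > 1 and result[-1] == 0:
        result.pop()
    return result


# f(x) = x^2 - 5 over Z_11, shifted by z = 2, gives x^2 - 4x - 1.
shifted = shift_poly([6, 0, 1], 2, 11)
```

Each Horner step costs time linear in the current degree, so the whole shift takes a quadratic number of operations in the degree of the polynomial.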

Classification of [math]\displaystyle{ \mathbb Z_p }[/math] elements

Due to Euler's criterion, for every monomial [math]\displaystyle{ (x-\lambda) }[/math] exactly one of the following properties holds:[1]

  1. The monomial is equal to [math]\displaystyle{ x }[/math] if [math]\displaystyle{ \lambda = 0 }[/math],
  2. The monomial divides [math]\displaystyle{ g_0(x)=(x^{(p-1)/2}-1) }[/math] if [math]\displaystyle{ \lambda }[/math] is a quadratic residue modulo [math]\displaystyle{ p }[/math],
  3. The monomial divides [math]\displaystyle{ g_1(x)=(x^{(p-1)/2}+1) }[/math] if [math]\displaystyle{ \lambda }[/math] is a quadratic non-residue modulo [math]\displaystyle{ p }[/math].

Thus if [math]\displaystyle{ f_z(x) }[/math] is not divisible by [math]\displaystyle{ x }[/math], which may be checked separately, and its roots are distinct, then [math]\displaystyle{ f_z(x) }[/math] is equal to the product of the greatest common divisors [math]\displaystyle{ \gcd(f_z(x);g_0(x)) }[/math] and [math]\displaystyle{ \gcd(f_z(x);g_1(x)) }[/math].[7]
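The classification can be checked numerically for a small prime. The sketch below uses Python's built-in three-argument pow; the helper name classify and the choice [math]\displaystyle{ p = 11 }[/math] are illustrative assumptions.

```python
def classify(lam, p):
    """Return 0 if lam is 0 mod p, 1 if it is a quadratic residue, -1 otherwise."""
    lam %= p
    if lam == 0:
        return 0
    # Euler's criterion: lam^((p-1)/2) is 1 for residues and p - 1 for non-residues.
    return 1 if pow(lam, (p - 1) // 2, p) == 1 else -1


p = 11
residues = [a for a in range(1, p) if classify(a, p) == 1]       # roots of g_0
non_residues = [a for a in range(1, p) if classify(a, p) == -1]  # roots of g_1
```

Every non-zero element lands in exactly one of the two lists, which is why the two greatest common divisors together account for all non-zero roots.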

Berlekamp's method

The property above leads to the following algorithm:[1]

  1. Explicitly calculate coefficients of [math]\displaystyle{ f_z(x) = f(x-z) }[/math],
  2. Calculate remainders of [math]\displaystyle{ x,x^2, x^{2^2},x^{2^3}, x^{2^4}, \ldots, x^{2^{\lfloor \log_2 p \rfloor}} }[/math] modulo [math]\displaystyle{ f_z(x) }[/math] by squaring the current polynomial and taking remainder modulo [math]\displaystyle{ f_z(x) }[/math],
  3. Using exponentiation by squaring and polynomials calculated on the previous steps calculate the remainder of [math]\displaystyle{ x^{(p-1)/2} }[/math] modulo [math]\displaystyle{ f_z(x) }[/math],
  4. If [math]\displaystyle{ x^{(p-1)/2} \not \equiv \pm 1 \pmod{f_z(x)} }[/math] then the greatest common divisors mentioned above provide a non-trivial factorization of [math]\displaystyle{ f_z(x) }[/math],
  5. Otherwise all roots of [math]\displaystyle{ f_z(x) }[/math] are either residues or non-residues simultaneously and one has to choose another [math]\displaystyle{ z }[/math].

If [math]\displaystyle{ f(x) }[/math] is divisible by some non-linear irreducible polynomial [math]\displaystyle{ g(x) }[/math] over [math]\displaystyle{ \mathbb Z_p }[/math], then calculating the [math]\displaystyle{ \gcd }[/math] with [math]\displaystyle{ g_0(x) }[/math] and [math]\displaystyle{ g_1(x) }[/math] yields a non-trivial factorization of [math]\displaystyle{ f_z(x)/g_z(x) }[/math]. Thus the algorithm can find all roots of arbitrary polynomials over [math]\displaystyle{ \mathbb Z_p }[/math].
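The steps above can be sketched in Python roughly as follows. This is a minimal illustration with schoolbook polynomial arithmetic, not a reference implementation: all function names are hypothetical, polynomials are coefficient lists with the lowest degree first, and the modular inverse via pow(a, -1, p) assumes Python 3.8 or later. A root of [math]\displaystyle{ f_z(x) }[/math] equal to zero is handled by checking the constant term separately.

```python
import random

def trim(a):
    """Drop zero leading coefficients in place and return the list."""
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_mod(a, b, p):
    """Remainder of a modulo b over Z_p; b must be non-zero and trimmed."""
    a = trim([c % p for c in a])
    inv = pow(b[-1], -1, p)
    while len(a) >= len(b):
        factor, shift = a[-1] * inv % p, len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] = (a[shift + i] - factor * c) % p
        trim(a)
    return a

def poly_mulmod(a, b, m, p):
    """Product a * b reduced modulo m over Z_p."""
    prod = [0] * (len(a) + len(b) - 1) if a and b else []
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] = (prod[i + j] + x * y) % p
    return poly_mod(prod, m, p)

def x_powmod(e, m, p):
    """x^e modulo m by repeated squaring (steps 2 and 3); deg m >= 1."""
    result, base = [1], poly_mod([0, 1], m, p)
    while e:
        if e & 1:
            result = poly_mulmod(result, base, m, p)
        base = poly_mulmod(base, base, m, p)
        e >>= 1
    return result

def poly_gcd(a, b, p):
    """Monic gcd over Z_p; a must be non-zero."""
    a, b = trim([c % p for c in a]), trim([c % p for c in b])
    while b:
        a, b = b, poly_mod(a, b, p)
    inv = pow(a[-1], -1, p)
    return [c * inv % p for c in a]

def shift_poly(f, z, p):
    """Coefficients of f(x - z) over Z_p (step 1), via Horner's rule."""
    result = []
    for a in reversed(f):
        nxt = [0] * (len(result) + 1)
        for i, c in enumerate(result):
            nxt[i + 1] = (nxt[i + 1] + c) % p
            nxt[i] = (nxt[i] - c * z) % p
        nxt[0] = (nxt[0] + a) % p
        result = nxt
    return trim(result)

def add_const(a, c, p):
    """Return a + c for a constant c."""
    a = list(a) or [0]
    a[0] = (a[0] + c) % p
    return trim(a)

def find_roots(f, p):
    """Set of distinct roots of f in Z_p for an odd prime p (f non-zero)."""
    f = trim([c % p for c in f])
    if len(f) <= 1:
        return set()                       # constant: no roots
    if len(f) == 2:
        return {(-f[0] * pow(f[1], -1, p)) % p}
    while True:
        z = random.randrange(p)
        fz = shift_poly(f, z, p)
        h = x_powmod((p - 1) // 2, fz, p)
        d0 = poly_gcd(fz, add_const(h, -1, p), p)   # residue roots
        d1 = poly_gcd(fz, add_const(h, 1, p), p)    # non-residue roots
        if len(d0) == len(fz) or len(d1) == len(fz):
            continue                       # step 5: no split, try another z
        found = find_roots(d0, p) | find_roots(d1, p)
        if fz[0] == 0:
            found.add(0)                   # x divides f_z
        return {(r - z) % p for r in found}  # roots of f are roots of f_z minus z
```

For instance, find_roots([6, 0, 1], 11) treats [math]\displaystyle{ x^2 - 5 }[/math] over [math]\displaystyle{ \mathbb Z_{11} }[/math]. Factors without roots simply drop out, because both greatest common divisors have smaller degree and the recursion continues on them.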

Modular square root

Consider the equation [math]\displaystyle{ x^2 \equiv a \pmod{p} }[/math] having elements [math]\displaystyle{ \beta }[/math] and [math]\displaystyle{ -\beta }[/math] as its roots. Solving this equation is equivalent to factorizing the polynomial [math]\displaystyle{ f(x) = x^2-a=(x-\beta)(x+\beta) }[/math] over [math]\displaystyle{ \mathbb Z_p }[/math]. In this particular case it is sufficient to calculate only [math]\displaystyle{ \gcd(f_z(x); g_0(x)) }[/math]. For this polynomial exactly one of the following properties will hold:

  1. GCD is equal to [math]\displaystyle{ 1 }[/math] which means that [math]\displaystyle{ z+\beta }[/math] and [math]\displaystyle{ z-\beta }[/math] are both quadratic non-residues,
  2. GCD is equal to [math]\displaystyle{ f_z(x) }[/math] which means that both numbers are quadratic residues,
  3. GCD is equal to [math]\displaystyle{ (x-t) }[/math] which means that exactly one of these numbers is a quadratic residue.

In the third case the GCD is equal to either [math]\displaystyle{ (x-z-\beta) }[/math] or [math]\displaystyle{ (x-z+\beta) }[/math]. This allows the solution to be written as [math]\displaystyle{ \beta = (t - z) \pmod{p} }[/math].[1]
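For the quadratic case the computation can be sketched compactly, because every remainder modulo [math]\displaystyle{ f_z(x) }[/math] has the form [math]\displaystyle{ u + vx }[/math]. The code below is an illustrative sketch under that representation: the names sqrt_mod and mul_mod_fz are hypothetical, pow(v, -1, p) assumes Python 3.8 or later, and the upfront Euler test and final check are added safeguards rather than part of the description above.

```python
import random

def mul_mod_fz(s, t, c1, c0, p):
    """Multiply u + v*x pairs modulo f_z(x) = x^2 + c1*x + c0 over Z_p."""
    const = s[0] * t[0]
    lin = s[0] * t[1] + s[1] * t[0]
    quad = s[1] * t[1]                 # coefficient of x^2, reduced below
    return ((const - quad * c0) % p, (lin - quad * c1) % p)

def sqrt_mod(a, p):
    """A square root of a modulo an odd prime p, or None if none exists."""
    a %= p
    if a == 0:
        return 0
    if pow(a, (p - 1) // 2, p) != 1:   # Euler's criterion: a is a non-residue
        return None
    while True:
        z = random.randrange(p)
        # f_z(x) = (x - z)^2 - a = x^2 - 2z*x + (z^2 - a)
        c1, c0 = (-2 * z) % p, (z * z - a) % p
        result, base, e = (1, 0), (0, 1), (p - 1) // 2
        while e:                       # x^((p-1)/2) mod f_z by repeated squaring
            if e & 1:
                result = mul_mod_fz(result, base, c1, c0, p)
            base = mul_mod_fz(base, base, c1, c0, p)
            e >>= 1
        u, v = result
        if v == 0:
            continue                   # cases 1 and 2: GCD is 1 or f_z
        # g_0 mod f_z = v*x + (u - 1); its root t is the candidate for case 3
        t = (1 - u) * pow(v, -1, p) % p
        beta = (t - z) % p
        if beta * beta % p == a:       # guards the corner case where f_z(0) = 0
            return beta
```

A call such as sqrt_mod(5, 11) returns one of the two roots; which one depends on the random choice of z.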

Example

Assume we need to solve the equation [math]\displaystyle{ x^2 \equiv 5\pmod{11} }[/math]. For this we need to factorize [math]\displaystyle{ f(x)=x^2-5=(x-\beta)(x+\beta) }[/math]. Consider some possible values of [math]\displaystyle{ z }[/math]:

  1. Let [math]\displaystyle{ z=3 }[/math]. Then [math]\displaystyle{ f_z(x) = (x-3)^2 - 5 = x^2 - 6x + 4 }[/math], thus [math]\displaystyle{ \gcd(x^2 - 6x + 4 ; x^5 - 1) = 1 }[/math]. Both numbers [math]\displaystyle{ 3 \pm \beta }[/math] are quadratic non-residues, so we need to take some other [math]\displaystyle{ z }[/math].
  2. Let [math]\displaystyle{ z=2 }[/math]. Then [math]\displaystyle{ f_z(x) = (x-2)^2 - 5 = x^2 - 4x - 1 }[/math], thus [math]\displaystyle{ \gcd( x^2 - 4x - 1 ; x^5 - 1)\equiv x - 9 \pmod{11} }[/math]. From this follows [math]\displaystyle{ x - 9 = x - 2 - \beta }[/math], so [math]\displaystyle{ \beta \equiv 7 \pmod{11} }[/math] and [math]\displaystyle{ -\beta \equiv -7 \equiv 4 \pmod{11} }[/math].

A manual check shows that, indeed, [math]\displaystyle{ 7^2 \equiv 49 \equiv 5\pmod{11} }[/math] and [math]\displaystyle{ 4^2\equiv 16 \equiv 5\pmod{11} }[/math].
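The two cases of the example can be cross-checked by listing, for every [math]\displaystyle{ z }[/math], how many of the roots [math]\displaystyle{ z \pm \beta }[/math] are quadratic residues, which is the degree of the GCD with [math]\displaystyle{ x^5 - 1 }[/math]. The sketch below takes [math]\displaystyle{ \beta = 7 }[/math] from the example; the names legendre and gcd_degree are illustrative.

```python
def legendre(n, p):
    """Euler's criterion: 0 if n is 0 mod p, 1 for a residue, -1 otherwise."""
    n %= p
    if n == 0:
        return 0
    return 1 if pow(n, (p - 1) // 2, p) == 1 else -1


p, beta = 11, 7   # beta is the root found in the example
gcd_degree = {}
for z in range(p):
    roots_of_fz = ((z + beta) % p, (z - beta) % p)
    # deg gcd(f_z, x^5 - 1) = number of roots of f_z that are quadratic residues
    gcd_degree[z] = sum(legendre(r, p) == 1 for r in roots_of_fz)

# Values of z for which the GCD is linear and therefore reveals a root.
splitting = [z for z, d in gcd_degree.items() if d == 1]
```

Here gcd_degree[3] is 0 and gcd_degree[2] is 1, matching the two cases worked out by hand.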

Correctness proof

The algorithm finds a factorization of [math]\displaystyle{ f_z(x) }[/math] in all cases except those in which all the numbers [math]\displaystyle{ z+\lambda_1, z+\lambda_2, \ldots, z+\lambda_n }[/math] are quadratic residues or non-residues simultaneously. According to the theory of cyclotomy,[8] the probability of such an event for the case when [math]\displaystyle{ \lambda_1, \ldots, \lambda_n }[/math] are all residues or non-residues simultaneously (that is, when [math]\displaystyle{ z=0 }[/math] would fail) may be estimated as [math]\displaystyle{ 2^{-k} }[/math], where [math]\displaystyle{ k }[/math] is the number of distinct values among [math]\displaystyle{ \lambda_1, \ldots, \lambda_n }[/math].[1] In this way, even for the worst case of [math]\displaystyle{ k=1 }[/math] and [math]\displaystyle{ f(x)=(x-\lambda)^n }[/math], the probability of error may be estimated as [math]\displaystyle{ 1/2 }[/math], and for the modular square root case the error probability is at most [math]\displaystyle{ 1/4 }[/math].

Complexity

Let a polynomial have degree [math]\displaystyle{ n }[/math]. We derive the algorithm's complexity as follows:

  1. Due to the binomial theorem [math]\displaystyle{ (x-z)^k = \sum\limits_{i=0}^k \binom{k}{i} (-z)^{k-i}x^i }[/math], we may transition from [math]\displaystyle{ f(x) }[/math] to [math]\displaystyle{ f(x-z) }[/math] in [math]\displaystyle{ O(n^2) }[/math] time.
  2. Polynomial multiplication and taking remainder of one polynomial modulo another one may be done in [math]\displaystyle{ O(n^2) }[/math], thus calculation of [math]\displaystyle{ x^{2^k} \bmod f_z(x) }[/math] is done in [math]\displaystyle{ O(n^2 \log p) }[/math].
  3. Binary exponentiation works in [math]\displaystyle{ O(n^2 \log p) }[/math].
  4. Taking the [math]\displaystyle{ \gcd }[/math] of two polynomials via Euclidean algorithm works in [math]\displaystyle{ O(n^2) }[/math].

Thus the whole procedure may be done in [math]\displaystyle{ O(n^2 \log p) }[/math]. Using the fast Fourier transform and the Half-GCD algorithm,[9] the algorithm's complexity may be improved to [math]\displaystyle{ O(n \log n \log pn) }[/math]. For the modular square root case the degree is [math]\displaystyle{ n = 2 }[/math], so the complexity of the algorithm in that case is bounded by [math]\displaystyle{ O(\log p) }[/math] per iteration.[7]

References

  1. Berlekamp, E. R. (1970). "Factoring polynomials over large finite fields". Mathematics of Computation 24 (111): 713–735. doi:10.1090/S0025-5718-1970-0276200-X. ISSN 00255718. https://www.ams.org/mcom/1970-24-111/S0025-5718-1970-0276200-X/.
  2. M. Rabin (1980). "Probabilistic Algorithms in Finite Fields". SIAM Journal on Computing 9 (2): 273–280. doi:10.1137/0209024. ISSN 00975397.
  3. Donald E Knuth (1998). The art of computer programming. Vol. 2. ISBN 978-0201896848. OCLC 900627019.
  4. Tsz-Wo Sze (2011). "On taking square roots without quadratic nonresidues over finite fields". Mathematics of Computation 80 (275): 1797–1811. doi:10.1090/s0025-5718-2011-02419-1. ISSN 00255718.
  5. R. Peralta (November 1986). "A simple and fast probabilistic algorithm for computing square roots modulo a prime number (Corresp.)". IEEE Transactions on Information Theory 32 (6): 846–847. doi:10.1109/TIT.1986.1057236. ISSN 00189448.
  6. C Padró, G Sáez (August 2002). "Taking cube roots in Zm". Applied Mathematics Letters 15 (6): 703–708. doi:10.1016/s0893-9659(02)00031-9. ISSN 08939659.
  7. Alfred J. Menezes, Ian F. Blake, XuHong Gao, Ronald C. Mullin, Scott A. Vanstone (1993). Applications of Finite Fields. The Springer International Series in Engineering and Computer Science. Springer US. ISBN 9780792392828. https://www.springer.com/gp/book/9780792392828.
  8. Marshall Hall (1998). Combinatorial Theory. John Wiley & Sons. ISBN 9780471315186. https://books.google.com/?id=__JCiiCfu2EC&pg=PA1&dq=Combinatorial+Theory+hall#v=onepage&q=Combinatorial%20Theory%20hall&f=false.
  9. Aho, Alfred V. (1974). The design and analysis of computer algorithms. Addison-Wesley Pub. Co. ISBN 0201000296. https://archive.org/details/designanalysisof00ahoarich.