Euler's factorization method

Euler's factorization method is a technique for factoring a number by writing it as a sum of two squares in two different ways. For example, the number [math]\displaystyle{ 1000009 }[/math] can be written as [math]\displaystyle{ 1000^2 + 3^2 }[/math] or as [math]\displaystyle{ 972^2 + 235^2 }[/math], and Euler's method gives the factorization [math]\displaystyle{ 1000009 = 293 \cdot 3413 }[/math]. The idea that two distinct representations of an odd positive integer may lead to a factorization was apparently first proposed by Marin Mersenne. However, it was not put to extensive use until one hundred years later, by Euler. His most celebrated use of the method that now bears his name was to factor the number [math]\displaystyle{ 1000009 }[/math], which apparently had previously been thought to be prime even though it is not a pseudoprime by any major primality test.

Euler's factorization method is more effective than Fermat's for integers whose factors are not close together, and potentially much more efficient than trial division if one can find representations of numbers as sums of two squares reasonably easily. Euler's development ultimately permitted much more efficient factoring of numbers and, by the 1910s, the development of large factor tables going up to about ten million[citation needed]. The methods used to find representations of numbers as sums of two squares are essentially the same as those used to find differences of squares in Fermat's factorization method.
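The search for representations can be illustrated with a minimal Python sketch. It is a naive exhaustive search, not an optimized method, and the function name two_square_representations is chosen here purely for illustration.

```python
from math import isqrt


def two_square_representations(n):
    """Return every pair (a, b) with 0 <= a <= b and a*a + b*b == n.

    Naive search: try each a up to sqrt(n/2) and test whether
    n - a*a is a perfect square.
    """
    reps = []
    a = 0
    while 2 * a * a <= n:          # enforces a <= b
        b2 = n - a * a
        b = isqrt(b2)              # exact integer square root
        if b * b == b2:
            reps.append((a, b))
        a += 1
    return reps


print(two_square_representations(1000009))
```

For Euler's method to apply, the returned list must contain at least two pairs.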

Disadvantage and limitation

The great disadvantage of Euler's factorization method is that it cannot be applied to factoring an integer with any prime factor of the form 4k + 3 occurring to an odd power in its prime factorization, as such a number can never be the sum of two squares. Even odd composite numbers of the form 4k + 1 are often the product of two primes of the form 4k + 3 (e.g. 3053 = 43 × 71) and again cannot be factored by Euler's method.
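The criterion above can be checked directly for small numbers. The following Python sketch uses plain trial division, so it is only practical for small inputs, and the name is_sum_of_two_squares is an illustrative choice. It counts representations in which one of the squares is zero, such as 9 = 3² + 0².

```python
def is_sum_of_two_squares(n):
    """True if n >= 1 can be written as x*x + y*y with x, y >= 0.

    Uses the criterion from the text: no prime of the form 4k + 3
    may occur to an odd power in the factorization of n.
    """
    if n < 1:
        return False
    p = 2
    while p * p <= n:
        if n % p == 0:
            exponent = 0
            while n % p == 0:      # strip out every factor p
                n //= p
                exponent += 1
            if p % 4 == 3 and exponent % 2 == 1:
                return False
        p += 1
    # What remains is 1 or a single prime occurring to the first power.
    return n % 4 != 3


print(is_sum_of_two_squares(3053), is_sum_of_two_squares(1000009))
```

A result of True only guarantees one representation; Euler's method additionally needs a second, different one.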

This restricted applicability has made Euler's factorization method disfavoured for computer factoring algorithms, since any user attempting to factor a random integer is unlikely to know whether Euler's method can actually be applied to the integer in question. It is only relatively recently that there have been attempts to develop Euler's method into computer algorithms for use on specialised numbers where it is known Euler's method can be applied.

Theoretical basis

The Brahmagupta–Fibonacci identity states that the product of two sums of two squares is itself a sum of two squares. Euler's method relies on this theorem, but can be viewed as its converse: given [math]\displaystyle{ n = a^2 + b^2 = c^2 + d^2 }[/math], we express [math]\displaystyle{ n }[/math] as a product of sums of two squares.
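
For reference, the identity can be written out explicitly. The letters p, q, r, s are used here only to avoid a clash with a, b, c, d:

[math]\displaystyle{ \left(p^2 + q^2\right)\left(r^2 + s^2\right) = (pr + qs)^2 + (ps - qr)^2 = (pr - qs)^2 + (ps + qr)^2 }[/math]

As a check, taking [math]\displaystyle{ 293 = 2^2 + 17^2 }[/math] and [math]\displaystyle{ 3413 = 7^2 + 58^2 }[/math] gives both representations of their product:

[math]\displaystyle{ 293 \cdot 3413 = (2 \cdot 7 + 17 \cdot 58)^2 + (2 \cdot 58 - 17 \cdot 7)^2 = 1000^2 + (-3)^2 = 1000^2 + 3^2 }[/math]

[math]\displaystyle{ 293 \cdot 3413 = (2 \cdot 7 - 17 \cdot 58)^2 + (2 \cdot 58 + 17 \cdot 7)^2 = (-972)^2 + 235^2 = 972^2 + 235^2 }[/math]

Euler's method runs this computation in reverse, starting from the two representations.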

First deduce that

[math]\displaystyle{ a^2 - c^2 = d^2 - b^2 }[/math]

and factor both sides to get

[math]\displaystyle{ (a-c)(a+c) = (d-b)(d+b) }[/math] (1)

Now let [math]\displaystyle{ k = \operatorname{gcd}(a-c,d-b) }[/math] and [math]\displaystyle{ h = \operatorname{gcd}(a+c,d+b) }[/math], so that there exist integers [math]\displaystyle{ l,m,l',m' }[/math] satisfying

  • [math]\displaystyle{ (a-c) = kl }[/math],
  • [math]\displaystyle{ (d-b) = km }[/math],

[math]\displaystyle{ \operatorname{gcd}(l,m) = 1 }[/math]

  • [math]\displaystyle{ (a+c) = hm' }[/math],
  • [math]\displaystyle{ (d+b) = hl' }[/math],

[math]\displaystyle{ \operatorname{gcd}(l',m') = 1 }[/math]

Substituting these into equation (1) gives

[math]\displaystyle{ klhm' = kmhl' }[/math]

Canceling common factors yields

[math]\displaystyle{ lm' = l'm }[/math]

Now using the fact that [math]\displaystyle{ (l,m) }[/math] and [math]\displaystyle{ \left(l',m'\right) }[/math] are pairs of relatively prime numbers, we find that

  • [math]\displaystyle{ l = l' }[/math]
  • [math]\displaystyle{ m = m' }[/math]

So

  • [math]\displaystyle{ (a-c) = kl }[/math]
  • [math]\displaystyle{ (d-b) = km }[/math]
  • [math]\displaystyle{ (a+c) = hm }[/math]
  • [math]\displaystyle{ (d+b) = hl }[/math]

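We now see that [math]\displaystyle{ \operatorname{gcd}(a+c,d-b) = m \cdot \operatorname{gcd}(h,k) }[/math] and [math]\displaystyle{ \operatorname{gcd}(a-c,d+b) = l \cdot \operatorname{gcd}(h,k) }[/math], so [math]\displaystyle{ m }[/math] and [math]\displaystyle{ l }[/math] can be recovered from these two greatest common divisors by dividing out [math]\displaystyle{ \operatorname{gcd}(h,k) }[/math].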

Applying the Brahmagupta–Fibonacci identity we get

[math]\displaystyle{ \left(k^2 + h^2\right)\left(l^2 + m^2\right) = (kl + hm)^2 + (km - hl)^2 = \bigl((a-c) + (a+c)\bigr)^2 + \bigl((d-b) - (d+b)\bigr)^2 = (2a)^2 + (2b)^2 = 4n. }[/math]

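Since [math]\displaystyle{ n }[/math] is odd, [math]\displaystyle{ a }[/math] and [math]\displaystyle{ b }[/math] have opposite parity, as do [math]\displaystyle{ c }[/math] and [math]\displaystyle{ d }[/math]. Exchanging the labels [math]\displaystyle{ c }[/math] and [math]\displaystyle{ d }[/math] at the outset if necessary, we may assume that [math]\displaystyle{ a }[/math] and [math]\displaystyle{ c }[/math] have the same parity. Then [math]\displaystyle{ a-c }[/math], [math]\displaystyle{ a+c }[/math], [math]\displaystyle{ d-b }[/math] and [math]\displaystyle{ d+b }[/math] are all even, so both members of the pair [math]\displaystyle{ (k,h) }[/math] are even. The factorization then becomes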

[math]\displaystyle{ n = \left(\left(\tfrac{k}{2}\right)^2 + \left(\tfrac{h}{2}\right)^2\right)\left(l^2 + m^2\right). \, }[/math]
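
The derivation above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: n is assumed odd, the four inputs are assumed non-negative, and the function name euler_factors is hypothetical. The sketch swaps c and d where needed so that a and c share a parity, which makes k and h even.

```python
from math import gcd


def euler_factors(a, b, c, d):
    """Factor n = a*a + b*b = c*c + d*d from two distinct representations.

    Assumes n is odd and a, b, c, d are non-negative integers.
    Returns a pair of factors whose product is n.
    """
    n = a * a + b * b
    if n != c * c + d * d:
        raise ValueError("the two pairs do not represent the same number")
    if n % 2 == 0:
        raise ValueError("this sketch only handles odd n")
    if (a - c) % 2 != 0:
        c, d = d, c                # make a and c share a parity
    if a == c:
        raise ValueError("the two representations are the same")
    k = gcd(a - c, d - b)          # even, since all four terms are even
    h = gcd(a + c, d + b)          # even for the same reason
    l = abs(a - c) // k
    m = abs(d - b) // k
    return (k // 2) ** 2 + (h // 2) ** 2, l * l + m * m


print(euler_factors(1000, 3, 972, 235))
```

Here l and m are obtained by dividing by k rather than from a second pair of greatest common divisors, which avoids having to account for any common factor of k and h.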

Worked example

Since: [math]\displaystyle{ \ 1000009 = 1000^2 + 3^2 = 972^2 + 235^2 }[/math]

we have from the formula above:

a = 1000    (A) a − c = 28      k = gcd[A,C] = 4
b = 3       (B) a + c = 1972    h = gcd[B,D] = 34
c = 972     (C) d − b = 232     l = gcd[A,D] = 14
d = 235     (D) d + b = 238     m = gcd[B,C] = 116

Thus,

[math]\displaystyle{ 1000009 = \left[\left(\frac{4}{2}\right)^2 + \left(\frac{34}{2}\right)^2\right] \cdot \left[\left(\frac{14}{2}\right)^2 + \left(\frac{116}{2}\right)^2\right] \, }[/math]
[math]\displaystyle{ = \left(2^2 + 17^2\right) \cdot \left(7^2 + 58^2\right) \, }[/math]
[math]\displaystyle{ = (4 + 289) \cdot (49 + 3364) \, }[/math]
[math]\displaystyle{ = 293 \cdot 3413 \, }[/math]

Pseudocode

function Euler_factorize(int n) -> list[int]
   # n is assumed to be odd
   if is_prime(n) then
       print("Number is not factorable")
       exit function
   # search only a <= b, so that a swapped pair is never counted twice
   for-loop from a=1 to a=floor(sqrt(n/2))
       b2 = n - a*a
       b = floor(sqrt(b2))
       if b*b==b2 then
           break loop preserving a,b
   if a*a+b*b!=n then
       print("Failed to find any expression for n as sum of squares")
       exit function
   for-loop from c=a+1 to c=floor(sqrt(n/2))
       d2 = n - c*c
       d = floor(sqrt(d2))
       if d*d==d2 then
           break loop preserving c,d
   if c*c+d*d!=n then
       print("Failed to find a second expression for n as sum of squares")
       exit function
   # make a and c share a parity so that k and h below are even
   if (c-a) is odd then
       swap c,d
   A = abs(c-a), B = c+a
   C = abs(b-d), D = b+d
   k = GCD(A,C), h = GCD(B,D)
   l = A//k, m = C//k
   factor1 = (k//2)*(k//2) + (h//2)*(h//2)
   factor2 = l*l + m*m
   return list[ factor1, factor2 ]

References