Kantorovich theorem

From HandWiki
Short description: About the convergence of Newton's method

The Kantorovich theorem, or Newton–Kantorovich theorem, is a mathematical statement on the semi-local convergence of Newton's method. It was first stated by Leonid Kantorovich in 1948.[1][2] It is similar to the form of the Banach fixed-point theorem, although it states existence and uniqueness of a zero rather than a fixed point.[3]

Newton's method constructs a sequence of points that under certain conditions will converge to a solution [math]\displaystyle{ x }[/math] of an equation [math]\displaystyle{ f(x)=0 }[/math] or a vector solution of a system of equation [math]\displaystyle{ F(x)=0 }[/math]. The Kantorovich theorem gives conditions on the initial point of this sequence. If those conditions are satisfied then a solution exists close to the initial point and the sequence converges to that point.[1][2]

Assumptions

Let [math]\displaystyle{ X\subset\R^n }[/math] be an open subset and [math]\displaystyle{ F:X \subset \R^n \to\R^n }[/math] a differentiable function with a Jacobian [math]\displaystyle{ F^{\prime}(\mathbf x) }[/math] that is locally Lipschitz continuous (for instance if [math]\displaystyle{ F }[/math] is twice differentiable). That is, it is assumed that for any [math]\displaystyle{ x \in X }[/math] there is an open subset [math]\displaystyle{ U\subset X }[/math] such that [math]\displaystyle{ x \in U }[/math] and there exists a constant [math]\displaystyle{ L\gt 0 }[/math] such that for any [math]\displaystyle{ \mathbf x,\mathbf y\in U }[/math]

[math]\displaystyle{ \|F'(\mathbf x)-F'(\mathbf y)\|\le L\;\|\mathbf x-\mathbf y\| }[/math]

holds. The norm on the left is the operator norm. In other words, for any vector [math]\displaystyle{ \mathbf v\in\R^n }[/math] the inequality

[math]\displaystyle{ \|F'(\mathbf x)(\mathbf v)-F'(\mathbf y)(\mathbf v)\|\le L\;\|\mathbf x-\mathbf y\|\,\|\mathbf v\| }[/math]

must hold.

Now choose any initial point [math]\displaystyle{ \mathbf x_0\in X }[/math]. Assume that [math]\displaystyle{ F'(\mathbf x_0) }[/math] is invertible and construct the Newton step [math]\displaystyle{ \mathbf h_0=-F'(\mathbf x_0)^{-1}F(\mathbf x_0). }[/math]

The next assumption is that not only the next point [math]\displaystyle{ \mathbf x_1=\mathbf x_0+\mathbf h_0 }[/math] but the entire ball [math]\displaystyle{ B(\mathbf x_1,\|\mathbf h_0\|) }[/math] is contained inside the set [math]\displaystyle{ X }[/math]. Let [math]\displaystyle{ M }[/math] be the Lipschitz constant for the Jacobian over this ball (assuming it exists).

As a last preparation, construct recursively, as long as it is possible, the sequences [math]\displaystyle{ (\mathbf x_k)_k }[/math], [math]\displaystyle{ (\mathbf h_k)_k }[/math], [math]\displaystyle{ (\alpha_k)_k }[/math] according to

[math]\displaystyle{ \begin{alignat}{2} \mathbf h_k&=-F'(\mathbf x_k)^{-1}F(\mathbf x_k)\\[0.4em] \alpha_k&=M\,\|F'(\mathbf x_k)^{-1}\|\,\|\mathbf h_k\|\\[0.4em] \mathbf x_{k+1}&=\mathbf x_k+\mathbf h_k. \end{alignat} }[/math]

Statement

Now if [math]\displaystyle{ \alpha_0\le\tfrac12 }[/math] then

  1. a solution [math]\displaystyle{ \mathbf x^* }[/math] of [math]\displaystyle{ F(\mathbf x^*)=0 }[/math] exists inside the closed ball [math]\displaystyle{ \bar B(\mathbf x_1,\|\mathbf h_0\|) }[/math] and
  2. the Newton iteration starting in [math]\displaystyle{ \mathbf x_0 }[/math] converges to [math]\displaystyle{ \mathbf x^* }[/math] with at least linear order of convergence.

A statement that is more precise but slightly more difficult to prove uses the roots [math]\displaystyle{ t^\ast\le t^{**} }[/math] of the quadratic polynomial

[math]\displaystyle{ p(t) =\left(\tfrac12L\|F'(\mathbf x_0)^{-1}\|^{-1}\right)t^2 -t+\|\mathbf h_0\| }[/math],
[math]\displaystyle{ t^{\ast/**}=\frac{2\|\mathbf h_0\|}{1\pm\sqrt{1-2\alpha_0}} }[/math]

and their ratio

[math]\displaystyle{ \theta =\frac{t^*}{t^{**}} =\frac{1-\sqrt{1-2\alpha_0}}{1+\sqrt{1-2\alpha_0}}. }[/math]

Then

  1. a solution [math]\displaystyle{ \mathbf x^* }[/math] exists inside the closed ball [math]\displaystyle{ \bar B(\mathbf x_1,\theta\|\mathbf h_0\|)\subset\bar B(\mathbf x_0,t^*) }[/math]
  2. it is unique inside the bigger ball [math]\displaystyle{ B(\mathbf x_0,t^{*\ast}) }[/math]
  3. and the convergence to the solution of [math]\displaystyle{ F }[/math] is dominated by the convergence of the Newton iteration of the quadratic polynomial [math]\displaystyle{ p(t) }[/math] towards its smallest root [math]\displaystyle{ t^\ast }[/math],[4] if [math]\displaystyle{ t_0=0,\,t_{k+1}=t_k-\tfrac{p(t_k)}{p'(t_k)} }[/math], then
    [math]\displaystyle{ \|\mathbf x_{k+p}-\mathbf x_k\|\le t_{k+p}-t_k. }[/math]
  4. The quadratic convergence is obtained from the error estimate[5]
    [math]\displaystyle{ \|\mathbf x_{n+1}-\mathbf x^*\| \le \theta^{2^n}\|\mathbf x_{n+1}-\mathbf x_n\| \le\frac{\theta^{2^n}}{2^n}\|\mathbf h_0\|. }[/math]

Corollary

In 1986, Yamamoto proved that the error evaluations of the Newton method such as Doring (1969), Ostrowski (1971, 1973),[6][7] Gragg-Tapia (1974), Potra-Ptak (1980),[8] Miel (1981),[9] Potra (1984),[10] can be derived from the Kantorovich theorem.[11]

Generalizations

There is a q-analog for the Kantorovich theorem.[12][13] For other generalizations/variations, see Ortega & Rheinboldt (1970).[14]

Applications

Oishi and Tanabe claimed that the Kantorovich theorem can be applied to obtain reliable solutions of linear programming.[15]

References

  1. 1.0 1.1 Deuflhard, P. (2004). Newton Methods for Nonlinear Problems. Affine Invariance and Adaptive Algorithms. Springer Series in Computational Mathematics. 35. Berlin: Springer. ISBN 3-540-21099-7. 
  2. 2.0 2.1 Zeidler, E. (1985). Nonlinear Functional Analysis and its Applications: Part 1: Fixed-Point Theorems. New York: Springer. ISBN 0-387-96499-1. 
  3. Dennis, John E.; Schnabel, Robert B. (1983). "The Kantorovich and Contractive Mapping Theorems". Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Englewood Cliffs: Prentice-Hall. pp. 92–94. ISBN 0-13-627216-9. https://books.google.com/books?id=ksvJTtJCx9cC&pg=PA92. 
  4. Ortega, J. M. (1968). "The Newton-Kantorovich Theorem". Amer. Math. Monthly 75 (6): 658–660. doi:10.2307/2313800. 
  5. Gragg, W. B.; Tapia, R. A. (1974). "Optimal Error Bounds for the Newton-Kantorovich Theorem". SIAM Journal on Numerical Analysis 11 (1): 10–13. doi:10.1137/0711002. Bibcode1974SJNA...11...10G. 
  6. Ostrowski, A. M. (1971). "La method de Newton dans les espaces de Banach". C. R. Acad. Sci. Paris 27 (A): 1251–1253. 
  7. Ostrowski, A. M. (1973). Solution of Equations in Euclidean and Banach Spaces. New York: Academic Press. ISBN 0-12-530260-6. 
  8. Potra, F. A.; Ptak, V. (1980). "Sharp error bounds for Newton's process". Numer. Math. 34: 63–72. doi:10.1007/BF01463998. 
  9. Miel, G. J. (1981). "An updated version of the Kantorovich theorem for Newton's method". Computing 27 (3): 237–244. doi:10.1007/BF02237981. 
  10. Potra, F. A. (1984). "On the a posteriori error estimates for Newton's method". Beiträge zur Numerische Mathematik 12: 125–138. 
  11. Yamamoto, T. (1986). "A method for finding sharp error bounds for Newton's method under the Kantorovich assumptions". Numerische Mathematik 49 (2–3): 203–220. doi:10.1007/BF01389624. 
  12. Rajkovic, P. M.; Stankovic, M. S.; Marinkovic, S. D. (2003). "On q-iterative methods for solving equations and systems". Novi Sad J. Math 33 (2): 127–137. 
  13. Rajković, P. M.; Marinković, S. D.; Stanković, M. S. (2005). "On q-Newton–Kantorovich method for solving systems of equations". Applied Mathematics and Computation 168 (2): 1432–1448. doi:10.1016/j.amc.2004.10.035. 
  14. Ortega, J. M.; Rheinboldt, W. C. (1970). Iterative Solution of Nonlinear Equations in Several Variables. SIAM. OCLC 95021. 
  15. Oishi, S.; Tanabe, K. (2009). "Numerical Inclusion of Optimum Point for Linear Programming". JSIAM Letters 1: 5–8. doi:10.14495/jsiaml.1.5. 

Further reading