Kantorovich theorem

Short description: About the convergence of Newton's method

The Kantorovich theorem, or Newton–Kantorovich theorem, is a mathematical statement on the semi-local convergence of Newton's method. It was first stated by Leonid Kantorovich in 1948.^[1]^[2] It is similar to the form of the Banach fixed-point theorem, although it states existence and uniqueness of a zero rather than a fixed point.^[3]

Newton's method constructs a sequence of points that under certain conditions will converge to a solution [math]\displaystyle{ x }[/math] of an equation [math]\displaystyle{ f(x)=0 }[/math] or a vector solution of a system of equation [math]\displaystyle{ F(x)=0 }[/math]. The Kantorovich theorem gives conditions on the initial point of this sequence. If those conditions are satisfied then a solution exists close to the initial point and the sequence converges to that point.^[1]^[2]

Assumptions

Let [math]\displaystyle{ X\subset\R^n }[/math] be an open subset and [math]\displaystyle{ F:X \subset \R^n \to\R^n }[/math] a differentiable function with a Jacobian [math]\displaystyle{ F^{\prime}(\mathbf x) }[/math] that is locally Lipschitz continuous (for instance if [math]\displaystyle{ F }[/math] is twice differentiable). That is, it is assumed that for any [math]\displaystyle{ x \in X }[/math] there is an open subset [math]\displaystyle{ U\subset X }[/math] such that [math]\displaystyle{ x \in U }[/math] and there exists a constant [math]\displaystyle{ L\gt 0 }[/math] such that for any [math]\displaystyle{ \mathbf x,\mathbf y\in U }[/math]

[math]\displaystyle{ \|F'(\mathbf x)-F'(\mathbf y)\|\le L\;\|\mathbf x-\mathbf y\| }[/math]

holds. The norm on the left is the operator norm. In other words, for any vector [math]\displaystyle{ \mathbf v\in\R^n }[/math] the inequality

[math]\displaystyle{ \|F'(\mathbf x)(\mathbf v)-F'(\mathbf y)(\mathbf v)\|\le L\;\|\mathbf x-\mathbf y\|\,\|\mathbf v\| }[/math]

must hold.

Now choose any initial point [math]\displaystyle{ \mathbf x_0\in X }[/math]. Assume that [math]\displaystyle{ F'(\mathbf x_0) }[/math] is invertible and construct the Newton step [math]\displaystyle{ \mathbf h_0=-F'(\mathbf x_0)^{-1}F(\mathbf x_0). }[/math]

The next assumption is that not only the next point [math]\displaystyle{ \mathbf x_1=\mathbf x_0+\mathbf h_0 }[/math] but the entire ball [math]\displaystyle{ B(\mathbf x_1,\|\mathbf h_0\|) }[/math] is contained inside the set [math]\displaystyle{ X }[/math]. Let [math]\displaystyle{ M }[/math] be the Lipschitz constant for the Jacobian over this ball (assuming it exists).

As a last preparation, construct recursively, as long as it is possible, the sequences [math]\displaystyle{ (\mathbf x_k)_k }[/math], [math]\displaystyle{ (\mathbf h_k)_k }[/math], [math]\displaystyle{ (\alpha_k)_k }[/math] according to

[math]\displaystyle{ \begin{alignat}{2} \mathbf h_k&=-F'(\mathbf x_k)^{-1}F(\mathbf x_k)\\[0.4em] \alpha_k&=M\,\|F'(\mathbf x_k)^{-1}\|\,\|\mathbf h_k\|\\[0.4em] \mathbf x_{k+1}&=\mathbf x_k+\mathbf h_k. \end{alignat} }[/math]

Statement

Now if [math]\displaystyle{ \alpha_0\le\tfrac12 }[/math] then

a solution [math]\displaystyle{ \mathbf x^* }[/math] of [math]\displaystyle{ F(\mathbf x^*)=0 }[/math] exists inside the closed ball [math]\displaystyle{ \bar B(\mathbf x_1,\|\mathbf h_0\|) }[/math] and
the Newton iteration starting in [math]\displaystyle{ \mathbf x_0 }[/math] converges to [math]\displaystyle{ \mathbf x^* }[/math] with at least linear order of convergence.

A statement that is more precise but slightly more difficult to prove uses the roots [math]\displaystyle{ t^\ast\le t^{**} }[/math] of the quadratic polynomial

[math]\displaystyle{ p(t) =\left(\tfrac12L\|F'(\mathbf x_0)^{-1}\|^{-1}\right)t^2 -t+\|\mathbf h_0\| }[/math],

[math]\displaystyle{ t^{\ast/**}=\frac{2\|\mathbf h_0\|}{1\pm\sqrt{1-2\alpha_0}} }[/math]

and their ratio

[math]\displaystyle{ \theta =\frac{t^*}{t^{**}} =\frac{1-\sqrt{1-2\alpha_0}}{1+\sqrt{1-2\alpha_0}}. }[/math]

Then

a solution [math]\displaystyle{ \mathbf x^* }[/math] exists inside the closed ball [math]\displaystyle{ \bar B(\mathbf x_1,\theta\|\mathbf h_0\|)\subset\bar B(\mathbf x_0,t^*) }[/math]
it is unique inside the bigger ball [math]\displaystyle{ B(\mathbf x_0,t^{*\ast}) }[/math]
and the convergence to the solution of [math]\displaystyle{ F }[/math] is dominated by the convergence of the Newton iteration of the quadratic polynomial [math]\displaystyle{ p(t) }[/math] towards its smallest root [math]\displaystyle{ t^\ast }[/math],^[4] if [math]\displaystyle{ t_0=0,\,t_{k+1}=t_k-\tfrac{p(t_k)}{p'(t_k)} }[/math], then
[math]\displaystyle{ \|\mathbf x_{k+p}-\mathbf x_k\|\le t_{k+p}-t_k. }[/math]
The quadratic convergence is obtained from the error estimate^[5]
[math]\displaystyle{ \|\mathbf x_{n+1}-\mathbf x^*\| \le \theta^{2^n}\|\mathbf x_{n+1}-\mathbf x_n\| \le\frac{\theta^{2^n}}{2^n}\|\mathbf h_0\|. }[/math]

Corollary

In 1986, Yamamoto proved that the error evaluations of the Newton method such as Doring (1969), Ostrowski (1971, 1973),^[6]^[7] Gragg-Tapia (1974), Potra-Ptak (1980),^[8] Miel (1981),^[9] Potra (1984),^[10] can be derived from the Kantorovich theorem.^[11]

Generalizations

There is a q-analog for the Kantorovich theorem.^[12]^[13] For other generalizations/variations, see Ortega & Rheinboldt (1970).^[14]

Applications

Oishi and Tanabe claimed that the Kantorovich theorem can be applied to obtain reliable solutions of linear programming.^[15]

References

↑ ^{Jump up to: 1.0} ^1.1 Deuflhard, P. (2004). Newton Methods for Nonlinear Problems. Affine Invariance and Adaptive Algorithms. Springer Series in Computational Mathematics. 35. Berlin: Springer. ISBN 3-540-21099-7.
↑ ^{Jump up to: 2.0} ^2.1 Zeidler, E. (1985). Nonlinear Functional Analysis and its Applications: Part 1: Fixed-Point Theorems. New York: Springer. ISBN 0-387-96499-1.
↑ Dennis, John E.; Schnabel, Robert B. (1983). "The Kantorovich and Contractive Mapping Theorems". Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Englewood Cliffs: Prentice-Hall. pp. 92–94. ISBN 0-13-627216-9. https://books.google.com/books?id=ksvJTtJCx9cC&pg=PA92.
↑ Ortega, J. M. (1968). "The Newton-Kantorovich Theorem". Amer. Math. Monthly 75 (6): 658–660. doi:10.2307/2313800.
↑ Gragg, W. B.; Tapia, R. A. (1974). "Optimal Error Bounds for the Newton-Kantorovich Theorem". SIAM Journal on Numerical Analysis 11 (1): 10–13. doi:10.1137/0711002. Bibcode: 1974SJNA...11...10G.
↑ Ostrowski, A. M. (1971). "La method de Newton dans les espaces de Banach". C. R. Acad. Sci. Paris 27 (A): 1251–1253.
↑ Ostrowski, A. M. (1973). Solution of Equations in Euclidean and Banach Spaces. New York: Academic Press. ISBN 0-12-530260-6.
↑ Potra, F. A.; Ptak, V. (1980). "Sharp error bounds for Newton's process". Numer. Math. 34: 63–72. doi:10.1007/BF01463998.
↑ Miel, G. J. (1981). "An updated version of the Kantorovich theorem for Newton's method". Computing 27 (3): 237–244. doi:10.1007/BF02237981.
↑ Potra, F. A. (1984). "On the a posteriori error estimates for Newton's method". Beiträge zur Numerische Mathematik 12: 125–138.
↑ Yamamoto, T. (1986). "A method for finding sharp error bounds for Newton's method under the Kantorovich assumptions". Numerische Mathematik 49 (2–3): 203–220. doi:10.1007/BF01389624.
↑ Rajkovic, P. M.; Stankovic, M. S.; Marinkovic, S. D. (2003). "On q-iterative methods for solving equations and systems". Novi Sad J. Math 33 (2): 127–137.
↑ Rajković, P. M.; Marinković, S. D.; Stanković, M. S. (2005). "On q-Newton–Kantorovich method for solving systems of equations". Applied Mathematics and Computation 168 (2): 1432–1448. doi:10.1016/j.amc.2004.10.035.
↑ Ortega, J. M.; Rheinboldt, W. C. (1970). Iterative Solution of Nonlinear Equations in Several Variables. SIAM. OCLC 95021.
↑ Oishi, S.; Tanabe, K. (2009). "Numerical Inclusion of Optimum Point for Linear Programming". JSIAM Letters 1: 5–8. doi:10.14495/jsiaml.1.5.

Anonymous

Search

Kantorovich theorem

Namespaces

More

Page actions

Contents

Assumptions

Statement

Corollary

Generalizations

Applications

References

Further reading

Navigation

Navigation

Help

Translate

Wiki tools

Wiki tools

Anonymous

Search

Kantorovich theorem

Assumptions

Statement

Corollary

Generalizations

Applications

References

Further reading

Navigation

Wiki tools

Page tools

Other projects

Categories