Convergence proof techniques

From HandWiki

Convergence proof techniques are canonical components of mathematical proofs that sequences or functions converge to a finite limit when the argument tends to infinity. There are many types of sequences and modes of convergence requiring different techniques. Below are some of the more common examples. This article is intended as an introduction aimed at helping practitioners explore appropriate techniques. The links below give details of necessary conditions and generalizations to more abstract settings. The convergence of series is already covered in the article on convergence tests.

Convergence in [math]\displaystyle{ \mathbb{R}^n }[/math]

It is common to want to prove convergence of a sequence [math]\displaystyle{ f:\mathbb{N}\rightarrow \mathbb{R}^n }[/math] or function [math]\displaystyle{ f:\mathbb{R}\rightarrow \mathbb{R}^n }[/math], where [math]\displaystyle{ \mathbb{N} }[/math] and [math]\displaystyle{ \mathbb{R} }[/math] refer to the natural numbers and the real numbers, and convergence is with respect to the Euclidean norm, [math]\displaystyle{ ||\cdot||_2 }[/math].

Useful approaches for this are as follows.

First principles

The analytic definition of convergence of [math]\displaystyle{ f }[/math] to a limit [math]\displaystyle{ f_{\infty} }[/math] is that[1] for all [math]\displaystyle{ \epsilon \gt 0 }[/math] there exists a [math]\displaystyle{ k_0 }[/math] such that for all [math]\displaystyle{ k \gt k_0 }[/math], [math]\displaystyle{ \|f(k) - f_{\infty}\| \lt \epsilon }[/math]. The most basic proof technique is to find such a [math]\displaystyle{ k_0 }[/math] and prove the required inequality. If the value of [math]\displaystyle{ f_{\infty} }[/math] is not known in advance, the techniques below may be useful.
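
For example, to prove that the sequence [math]\displaystyle{ f(k) = 1/k }[/math] converges to [math]\displaystyle{ f_\infty = 0 }[/math], one can take [math]\displaystyle{ k_0 = \lceil 1/\epsilon \rceil }[/math] for a given [math]\displaystyle{ \epsilon \gt 0 }[/math]; then for all [math]\displaystyle{ k \gt k_0 }[/math], [math]\displaystyle{ |f(k) - f_\infty| = 1/k \lt 1/k_0 \le \epsilon }[/math], as required.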

Contraction mappings

In many cases, the function whose convergence is of interest has the form [math]\displaystyle{ f(k+1) = T(f(k)) }[/math] for some transformation [math]\displaystyle{ T }[/math]. For example, [math]\displaystyle{ T }[/math] could map [math]\displaystyle{ f(k) }[/math] to [math]\displaystyle{ f(k+1)=A f(k) }[/math] for some conformable matrix [math]\displaystyle{ A }[/math]. Alternatively, [math]\displaystyle{ T }[/math] may be an element-wise operation, such as replacing each element of [math]\displaystyle{ f(k) }[/math] by the square root of its magnitude.

In such cases, if the problem satisfies the conditions of the Banach fixed-point theorem (the domain is a non-empty complete metric space), then it is sufficient to prove that [math]\displaystyle{ \|T(x) - T(y)\| \le k\|x - y\| }[/math] for some constant [math]\displaystyle{ 0 \le k \lt 1 }[/math] which is fixed for all [math]\displaystyle{ x }[/math] and [math]\displaystyle{ y }[/math]. Such a [math]\displaystyle{ T }[/math] is called a contraction mapping. The composition of two contraction mappings is a contraction mapping, so if [math]\displaystyle{ T = T_1 \circ T_2 }[/math], then it is sufficient to show that [math]\displaystyle{ T_1 }[/math] and [math]\displaystyle{ T_2 }[/math] are both contraction mappings.
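
As an illustration, the following sketch iterates a contraction and shows the geometric shrinking of successive differences guaranteed by the Banach fixed-point theorem. Python and the particular map [math]\displaystyle{ T(x) = \tfrac{1}{2}\cos(x) + 1 }[/math] (which has Lipschitz constant at most [math]\displaystyle{ \tfrac{1}{2} }[/math]) are illustrative choices, not taken from the article.

<syntaxhighlight lang="python">
import math

def T(x):
    # An example contraction on the real line: |T'(x)| = |sin(x)|/2 <= 1/2 < 1.
    return 0.5 * math.cos(x) + 1.0

x = 0.0                        # arbitrary starting point
for k in range(20):
    x_next = T(x)
    # |x_next - x| shrinks by at least a factor of 1/2 per step,
    # so the iterates form a Cauchy sequence converging to the unique fixed point.
    print(k, x_next, abs(x_next - x))
    x = x_next
</syntaxhighlight>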

Example

A famous example of the use of this approach is the following:

  • If [math]\displaystyle{ T }[/math] has the form [math]\displaystyle{ T(x) = Ax + B }[/math] for some matrices [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math], then convergence to [math]\displaystyle{ (I-A)^{-1}B }[/math] occurs if the magnitudes of all eigenvalues of [math]\displaystyle{ A }[/math] are less than 1 (see the sketch below).
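
A minimal numerical check of this case might look as follows; NumPy and the specific matrices are illustrative assumptions, not part of the article.

<syntaxhighlight lang="python">
import numpy as np

A = np.array([[0.5, 0.2],
              [0.1, 0.3]])                 # all eigenvalue magnitudes < 1
B = np.array([1.0, -1.0])

assert max(abs(np.linalg.eigvals(A))) < 1  # the convergence condition above

x = np.zeros(2)                            # arbitrary starting point
for _ in range(200):
    x = A @ x + B                          # iterate f(k+1) = A f(k) + B

x_star = np.linalg.solve(np.eye(2) - A, B) # closed-form limit (I - A)^{-1} B
print(x, x_star)                           # the two agree to numerical precision
</syntaxhighlight>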

Non-expansion mappings

If the constant [math]\displaystyle{ k }[/math] is only required to satisfy [math]\displaystyle{ k \le 1 }[/math], the mapping is a non-expansion mapping. It is not sufficient for convergence that [math]\displaystyle{ T }[/math] be a non-expansion mapping. For example, [math]\displaystyle{ T(x) = -x }[/math] is a non-expansion mapping, but the sequence [math]\displaystyle{ f(k+1) = -f(k) }[/math] oscillates between two values and does not converge unless [math]\displaystyle{ f(0) = 0 }[/math]. However, the composition of a contraction mapping and a non-expansion mapping (or vice versa) is a contraction mapping.

Contraction mappings on limited domains

If [math]\displaystyle{ T }[/math] is not a contraction mapping on its entire domain, but it is on its codomain (the image of the domain), that is also sufficient for convergence. This also applies to decompositions. For example, consider [math]\displaystyle{ T(x) = \cos(\sin(x)) }[/math]. The function [math]\displaystyle{ \cos }[/math] is not a contraction mapping on all of [math]\displaystyle{ \mathbb{R} }[/math], but it is on the restricted domain [math]\displaystyle{ [-1, 1] }[/math], which is the codomain of [math]\displaystyle{ \sin }[/math] for real arguments. Since [math]\displaystyle{ \sin }[/math] is a non-expansion mapping, this implies [math]\displaystyle{ T }[/math] is a contraction mapping.
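
This can be made quantitative: by the chain rule, [math]\displaystyle{ |T'(x)| = |\sin(\sin(x))\,\cos(x)| \le \sin(1) \approx 0.84 \lt 1 }[/math] for all real [math]\displaystyle{ x }[/math], so the mean value theorem gives [math]\displaystyle{ |T(x) - T(y)| \le \sin(1)\,|x - y| }[/math], confirming that [math]\displaystyle{ T }[/math] is a contraction with constant [math]\displaystyle{ \sin(1) }[/math].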

Convergent subsequences

Every bounded sequence in [math]\displaystyle{ \mathbb R^n }[/math] has a convergent subsequence, by the Bolzano–Weierstrass theorem. If all of the convergent subsequences of a bounded sequence [math]\displaystyle{ f }[/math] can be shown to have the same limit (for example, by showing that every subsequential limit must be a fixed point of the transformation [math]\displaystyle{ T }[/math] and that [math]\displaystyle{ T }[/math] has a unique fixed point), then [math]\displaystyle{ f }[/math] must also converge to that limit.

Monotonicity (Lyapunov functions)

Every bounded monotonic sequence in [math]\displaystyle{ \mathbb R }[/math] converges to a limit; consequently, a bounded sequence in [math]\displaystyle{ \mathbb R^n }[/math] whose coordinates are each monotonic also converges.
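
For example, the sequence defined by [math]\displaystyle{ f(k+1) = \sqrt{f(k) + 2} }[/math] with [math]\displaystyle{ f(0) = 0 }[/math] is increasing and bounded above by 2, so it converges; its limit [math]\displaystyle{ L }[/math] must satisfy [math]\displaystyle{ L = \sqrt{L + 2} }[/math], which gives [math]\displaystyle{ L = 2 }[/math].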

This approach can also be applied to sequences that are not monotonic. Instead, it is possible to define a function [math]\displaystyle{ V:\mathbb{R}^n\rightarrow \mathbb{R} }[/math] such that [math]\displaystyle{ V(f(n)) }[/math] is monotonic in [math]\displaystyle{ n }[/math]. If [math]\displaystyle{ V }[/math] satisfies the conditions to be a Lyapunov function, then [math]\displaystyle{ f }[/math] converges. Lyapunov's theorem is normally stated for ordinary differential equations, but can also be applied to sequences of iterates by replacing derivatives with discrete differences.

The basic requirements on [math]\displaystyle{ V }[/math] are that

  1. [math]\displaystyle{ V(f(n+1)) - V(f(n)) \lt 0 }[/math] for [math]\displaystyle{ f(n) \ne 0 }[/math] and [math]\displaystyle{ V(0) = 0 }[/math] (or [math]\displaystyle{ \dot{V}(x) \lt 0 }[/math] for [math]\displaystyle{ x \ne 0 }[/math])
  2. [math]\displaystyle{ V(x) \gt 0 }[/math] for all [math]\displaystyle{ x\ne 0 }[/math] and [math]\displaystyle{ V(0) = 0 }[/math]
  3. [math]\displaystyle{ V }[/math] is "radially unbounded", so that [math]\displaystyle{ V(x) }[/math] goes to infinity along any sequence of points whose norm [math]\displaystyle{ \|x\| }[/math] tends to infinity.

In many cases, a Lyapunov function of the form [math]\displaystyle{ V(x) = x^T A x }[/math] can be found, although more complex forms are also used.
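
As a sketch of how such a check can be carried out numerically, the following verifies condition 1 along a trajectory of a linear iteration. NumPy, the matrix, the starting point, and the choice [math]\displaystyle{ V(x) = x^T x }[/math] (that is, [math]\displaystyle{ A }[/math] in the quadratic form taken to be the identity) are illustrative assumptions.

<syntaxhighlight lang="python">
import numpy as np

A = np.array([[0.5, 0.1],
              [0.0, 0.3]])          # example matrix with spectral norm < 1

def V(x):
    # Candidate quadratic Lyapunov function V(x) = x^T x: positive definite
    # (condition 2) and radially unbounded (condition 3).
    return float(x @ x)

assert np.linalg.norm(A, 2) < 1     # guarantees V(Ax) < V(x) for x != 0

x = np.array([5.0, -3.0])           # arbitrary non-zero starting point
for k in range(20):
    x_next = A @ x
    print(k, V(x_next) - V(x))      # condition 1: this difference is negative
    x = x_next
</syntaxhighlight>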

For delay differential equations, a similar approach applies, with Lyapunov functions replaced by Lyapunov functionals, also called Lyapunov–Krasovskii functionals.

If the inequality in condition 1 is weak, LaSalle's invariance principle may be used.

Convergence of sequences of functions

To consider the convergence of sequences of functions,[2] it is necessary to define a notion of distance between functions to replace the Euclidean norm. Commonly used notions of convergence include the following.

  • Convergence in the norm (strong convergence) -- a function norm, such as [math]\displaystyle{ \|g\|_f = \int_{x \in A} \|g(x)\| dx }[/math], is defined, and convergence occurs if [math]\displaystyle{ \|f_n-f_\infty\|_f \rightarrow 0 }[/math]. For this case, all of the above techniques can be applied with this function norm.
  • Pointwise convergence -- convergence occurs if for each [math]\displaystyle{ x }[/math], [math]\displaystyle{ f_n(x) \rightarrow f_\infty(x) }[/math]. For this case, the above techniques can be applied for each point [math]\displaystyle{ x }[/math] with the norm appropriate for [math]\displaystyle{ f(x) }[/math].
  • Uniform convergence -- in pointwise convergence, some (open) regions can converge arbitrarily slowly. With uniform convergence, there is a fixed convergence rate such that all points converge at least that fast. Formally, [math]\displaystyle{ \lim_{n\to\infty}\,\sup\{\,\left|f_n(x)-f_\infty(x)\right| : x \in A \,\}=0, }[/math] where [math]\displaystyle{ A }[/math] is the domain of each [math]\displaystyle{ f_n }[/math].
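
A standard example of the difference between the last two notions: on the domain [math]\displaystyle{ A = [0, 1) }[/math], the functions [math]\displaystyle{ f_n(x) = x^n }[/math] converge pointwise to [math]\displaystyle{ f_\infty(x) = 0 }[/math], but [math]\displaystyle{ \sup\{\,|f_n(x) - f_\infty(x)| : x \in A\,\} = 1 }[/math] for every [math]\displaystyle{ n }[/math], so the convergence is not uniform.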


Convergence of random variables

Random variables[3] are more complicated than simple elements of [math]\displaystyle{ \mathbb{R}^n }[/math]. (Formally, a random variable is a mapping [math]\displaystyle{ x:\Omega\rightarrow V }[/math] from an event space [math]\displaystyle{ \Omega }[/math] to a value space [math]\displaystyle{ V }[/math]. The value space may be [math]\displaystyle{ \mathbb{R}^n }[/math], such as the roll of a die, and such a random variable is often spoken of informally as being in [math]\displaystyle{ \mathbb{R}^n }[/math], but convergence of a sequence of random variables corresponds to convergence of the sequence of functions, or of the distributions, rather than of the sequence of values.)

There are multiple types of convergence, depending on how the distance between functions is measured.

Each has its own proof techniques, which are beyond the current scope of this article.


Topological convergence

For all of the above techniques, some form of the basic analytic definition of convergence above applies. However, topology has its own definition of convergence. For example, in a non-Hausdorff space, it is possible for a sequence to converge to multiple different limits.

References

  1. Ross, Kenneth. Elementary Analysis: The Theory of Calculus. Springer. 
  2. Haase, Markus. Functional Analysis: An Elementary Introduction. American Mathematical Society.
  3. Billingsley, Patrick (1995). Probability and Measure. John Wiley & Sons.