Hilbert projection theorem

From HandWiki
Short description: On closed convex subsets in Hilbert space

In mathematics, the Hilbert projection theorem is a famous result of convex analysis that says that for every vector [math]\displaystyle{ x }[/math] in a Hilbert space [math]\displaystyle{ H }[/math] and every nonempty closed convex [math]\displaystyle{ C \subseteq H, }[/math] there exists a unique vector [math]\displaystyle{ m \in C }[/math] for which [math]\displaystyle{ \|c - x\| }[/math] is minimized over the vectors [math]\displaystyle{ c \in C }[/math]; that is, such that [math]\displaystyle{ \|m - x\| \leq \|c - x\| }[/math] for every [math]\displaystyle{ c \in C. }[/math]

Finite dimensional case

Some intuition for the theorem can be obtained by considering the first order condition of the optimization problem.

Consider a finite dimensional real Hilbert space [math]\displaystyle{ H }[/math] with a subspace [math]\displaystyle{ C }[/math] and a point [math]\displaystyle{ x. }[/math] If [math]\displaystyle{ m \in C }[/math] is a minimizer or minimum point of the function [math]\displaystyle{ N : C \to \R }[/math] defined by [math]\displaystyle{ N(c) := \|c - x\| }[/math] (which is the same as the minimum point of [math]\displaystyle{ c \mapsto \|c - x\|^2 }[/math]), then derivative must be zero at [math]\displaystyle{ m. }[/math]

In matrix derivative notation[1] [math]\displaystyle{ \begin{aligned} \partial \lVert x - c \rVert^2 &= \partial \langle c - x, c - x \rangle \\ &= 2 \langle c - x, \partial c\rangle \end{aligned} }[/math] Since [math]\displaystyle{ \partial c }[/math] is a vector in [math]\displaystyle{ C }[/math] that represents an arbitrary tangent direction, it follows that [math]\displaystyle{ m - x }[/math] must be orthogonal to every vector in [math]\displaystyle{ C. }[/math]

Statement

Hilbert projection theorem — For every vector [math]\displaystyle{ x }[/math] in a Hilbert space [math]\displaystyle{ H }[/math] and every nonempty closed convex [math]\displaystyle{ C \subseteq H, }[/math] there exists a unique vector [math]\displaystyle{ m \in C }[/math] for which [math]\displaystyle{ \lVert x - m \rVert }[/math] is equal to [math]\displaystyle{ \delta := \inf_{c \in C} \|x - c\|. }[/math]

If the closed subset [math]\displaystyle{ C }[/math] is also a vector subspace of [math]\displaystyle{ H }[/math] then this minimizer [math]\displaystyle{ m }[/math] is the unique element in [math]\displaystyle{ C }[/math] such that [math]\displaystyle{ x - m }[/math] is orthogonal to [math]\displaystyle{ C. }[/math]

Detailed elementary proof

Proof by reduction to a special case

It suffices to prove the theorem in the case of [math]\displaystyle{ x = 0 }[/math] because the general case follows from the statement below by replacing [math]\displaystyle{ C }[/math] with [math]\displaystyle{ C - x. }[/math]

Hilbert projection theorem (case [math]\displaystyle{ x = 0 }[/math])[2] — For every nonempty closed convex subset [math]\displaystyle{ C \subseteq H }[/math] of a Hilbert space [math]\displaystyle{ H, }[/math] there exists a unique vector [math]\displaystyle{ m \in C }[/math] such that [math]\displaystyle{ \inf_{c \in C} \| c \| = \| m \|. }[/math]

Furthermore, letting [math]\displaystyle{ d := \inf_{c \in C} \| c \|, }[/math] if [math]\displaystyle{ \left(c_n\right)_{n=1}^{\infty} }[/math] is any sequence in [math]\displaystyle{ C }[/math] such that [math]\displaystyle{ \lim_{n \to \infty} \left\|c_n\right\| = d }[/math] in [math]\displaystyle{ \R }[/math][note 1] then [math]\displaystyle{ \lim_{n \to \infty} c_n = m }[/math] in [math]\displaystyle{ H. }[/math]

Consequences

Proposition — If [math]\displaystyle{ C }[/math] is a closed vector subspace of a Hilbert space [math]\displaystyle{ H }[/math] then[note 3] [math]\displaystyle{ H = C \oplus C^{\bot}. }[/math]

Proof[3]

Proof that [math]\displaystyle{ C \cap C^{\bot} = \{ 0 \} }[/math]:

If [math]\displaystyle{ c \in C \cap C^{\bot} }[/math] then [math]\displaystyle{ 0 = \langle \,c, \,c\, \rangle = \|c\|^2, }[/math] which implies [math]\displaystyle{ c = 0. }[/math] [math]\displaystyle{ \blacksquare }[/math]


Proof that [math]\displaystyle{ C^{\bot} }[/math] is a closed vector subspace of [math]\displaystyle{ H }[/math]:

Let [math]\displaystyle{ P := \prod_{c \in C} \mathbb{F} }[/math] where [math]\displaystyle{ \mathbb{F} }[/math] is the underlying scalar field of [math]\displaystyle{ H }[/math] and define [math]\displaystyle{ \begin{alignat}{4} L : \,& H && \to \,&& P \\ & h && \mapsto\,&& \left(\langle \,h, \,c\, \rangle\right)_{c \in C} \\ \end{alignat} }[/math] which is continuous and linear because this is true of each of its coordinates [math]\displaystyle{ h \mapsto \langle h, c \rangle. }[/math] The set [math]\displaystyle{ C^{\bot} = L^{-1}(0) = L^{-1}\left(\{ 0 \}\right) }[/math] is closed in [math]\displaystyle{ H }[/math] because [math]\displaystyle{ \{ 0 \} }[/math] is closed in [math]\displaystyle{ P }[/math] and [math]\displaystyle{ L : H \to P }[/math] is continuous. The kernel of any linear map is a vector subspace of its domain, which is why [math]\displaystyle{ C^{\bot} = \ker L }[/math] is a vector subspace of [math]\displaystyle{ H. }[/math] [math]\displaystyle{ \blacksquare }[/math]


Proof that [math]\displaystyle{ C + C^{\bot} = H }[/math]:

Let [math]\displaystyle{ x \in H. }[/math] The Hilbert projection theorem guarantees the existence of a unique [math]\displaystyle{ m \in C }[/math] such that [math]\displaystyle{ \|x - m\| \leq \|x - c\| \text{ for all } c \in C }[/math] (or equivalently, for all [math]\displaystyle{ x - c \in x - C }[/math]). Let [math]\displaystyle{ p := x - m }[/math] so that [math]\displaystyle{ x = m + p \in C + p }[/math] and it remains to show that [math]\displaystyle{ p \in C^{\bot}. }[/math] The inequality above can be rewritten as: [math]\displaystyle{ \|p\| \leq \|z\| \quad \text{ for all } z \in x - C. }[/math] Because [math]\displaystyle{ m \in C }[/math] and [math]\displaystyle{ C }[/math] is a vector space, [math]\displaystyle{ m + C = C }[/math] and [math]\displaystyle{ C = - C, }[/math] which implies that [math]\displaystyle{ x - C = x + C = p + m + C = p + C. }[/math] The previous inequality thus becomes [math]\displaystyle{ \|p\| \leq \|z\| \quad \text{ for all } z \in p + C. }[/math] or equivalently, [math]\displaystyle{ \|p\| \leq \|p + c\| \quad \text{ for all } c \in C. }[/math] But this last statement is true if and only if [math]\displaystyle{ \langle \,p, c\, \rangle = 0 }[/math] every [math]\displaystyle{ c \in C. }[/math] Thus [math]\displaystyle{ p \in C^{\bot}. }[/math] [math]\displaystyle{ \blacksquare }[/math]

Properties

Expression as a global minimum

The statement and conclusion of the Hilbert projection theorem can be expressed in terms of global minimums of the followings functions. Their notation will also be used to simplify certain statements.

Given a non-empty subset [math]\displaystyle{ C \subseteq H }[/math] and some [math]\displaystyle{ x \in H, }[/math] define a function [math]\displaystyle{ d_{C,x} : C \to [0, \infty) \quad \text{ by } c \mapsto \|x - c\|. }[/math] A global minimum point of [math]\displaystyle{ d_{C,x}, }[/math] if one exists, is any point [math]\displaystyle{ m }[/math] in [math]\displaystyle{ \,\operatorname{domain} d_{C,x} = C\, }[/math] such that [math]\displaystyle{ d_{C,x}(m) \,\leq\, d_{C,x}(c) \quad \text{ for all } c \in C, }[/math] in which case [math]\displaystyle{ d_{C,x}(m) = \|m - x\| }[/math] is equal to the global minimum value of the function [math]\displaystyle{ d_{C, x}, }[/math] which is: [math]\displaystyle{ \inf_{c \in C} d_{C,x}(c) = \inf_{c \in C} \|x - c\|. }[/math]

Effects of translations and scalings

When this global minimum point [math]\displaystyle{ m }[/math] exists and is unique then denote it by [math]\displaystyle{ \min(C, x); }[/math] explicitly, the defining properties of [math]\displaystyle{ \min(C, x) }[/math] (if it exists) are: [math]\displaystyle{ \min(C, x) \in C \quad \text { and } \quad \left\|x - \min(C, x)\right\| \leq \|x - c\| \quad \text{ for all } c \in C. }[/math] The Hilbert projection theorem guarantees that this unique minimum point exists whenever [math]\displaystyle{ C }[/math] is a non-empty closed and convex subset of a Hilbert space. However, such a minimum point can also exist in non-convex or non-closed subsets as well; for instance, just as long is [math]\displaystyle{ C }[/math] is non-empty, if [math]\displaystyle{ x \in C }[/math] then [math]\displaystyle{ \min(C, x) = x. }[/math]

If [math]\displaystyle{ C \subseteq H }[/math] is a non-empty subset, [math]\displaystyle{ s }[/math] is any scalar, and [math]\displaystyle{ x, x_0 \in H }[/math] are any vectors then [math]\displaystyle{ \,\min\left(s C + x_0, s x + x_0\right) = s \min(C, x) + x_0 }[/math] which implies: [math]\displaystyle{ \begin{alignat}{6} \min&(s C, s x) &&= s &&\min(C, x) \\ \min&(- C, - x) &&= - &&\min(C, x) \\ \end{alignat} }[/math] [math]\displaystyle{ \begin{alignat}{6} \min\left(C + x_0, x + x_0\right) &= \min(C, x) + x_0 \\ \min\left(C - x_0, x - x_0\right) &= \min(C, x) - x_0 \\ \end{alignat} }[/math] [math]\displaystyle{ \begin{alignat}{6} \min&(C, - x) {} &&= \min(C + x, 0) - x \\ \min&(C, 0) \;+\; x\;\;\;\; &&= \min(C + x, x) \\ \min&(C - x, 0) {} &&= \min(C, x) - x \\ \end{alignat} }[/math]

Examples

The following counter-example demonstrates a continuous linear isomorphism [math]\displaystyle{ A : H \to H }[/math] for which [math]\displaystyle{ \,\min(A(C), A(x)) \neq A(\min(C, x)). }[/math] Endow [math]\displaystyle{ H := \R^2 }[/math] with the dot product, let [math]\displaystyle{ x_0 := (0, 1), }[/math] and for every real [math]\displaystyle{ s \in \R, }[/math] let [math]\displaystyle{ L_s := \{ (x, s x) : x \in \R \} }[/math] be the line of slope [math]\displaystyle{ s }[/math] through the origin, where it is readily verified that [math]\displaystyle{ \min\left(L_s, x_0\right) = \frac{s}{1+s^2}(1, s). }[/math] Pick a real number [math]\displaystyle{ r \neq 0 }[/math] and define [math]\displaystyle{ A : \R^2 \to \R^2 }[/math] by [math]\displaystyle{ A(x, y) := (r x, y) }[/math] (so this map scales the [math]\displaystyle{ x- }[/math]coordinate by [math]\displaystyle{ r }[/math] while leaving the [math]\displaystyle{ y- }[/math]coordinate unchanged). Then [math]\displaystyle{ A : \R^2 \to \R^2 }[/math] is an invertible continuous linear operator that satisfies [math]\displaystyle{ A\left(L_s\right) = L_{s/r} }[/math] and [math]\displaystyle{ A\left(x_0\right) = x_0, }[/math] so that [math]\displaystyle{ \,\min\left(A\left(L_s\right), A\left(x_0\right)\right) = \frac{s}{r^2 + s^2} (1, s) }[/math] and [math]\displaystyle{ A\left(\min\left(L_s, x_0\right)\right) = \frac{s}{1 + s^2} \left(r, s\right). }[/math] Consequently, if [math]\displaystyle{ C := L_s }[/math] with [math]\displaystyle{ s \neq 0 }[/math] and if [math]\displaystyle{ (r, s) \neq (\pm 1, 1) }[/math] then [math]\displaystyle{ \,\min(A(C), A\left(x_0\right)) \neq A\left(\min\left(C, x_0\right)\right). }[/math]

See also

Notes

  1. Because the norm [math]\displaystyle{ \| \cdot \| : H \to \R }[/math] is continuous, if [math]\displaystyle{ \lim_{n \to \infty} x_n }[/math] converges in [math]\displaystyle{ H }[/math] then necessarily [math]\displaystyle{ \lim_{n \to \infty} \left\|x_n\right\| }[/math] converges in [math]\displaystyle{ \R. }[/math] But in general, the converse is not guaranteed. However, under this theorem's hypotheses, knowing that [math]\displaystyle{ \lim_{n \to \infty} \left\|c_n\right\| = d }[/math] in [math]\displaystyle{ \R }[/math] is sufficient to conclude that [math]\displaystyle{ \lim_{n \to \infty} c_n }[/math] converges in [math]\displaystyle{ H. }[/math]
  2. Explicitly, this means that given any [math]\displaystyle{ \epsilon \gt 0 }[/math] there exists some integer [math]\displaystyle{ N \gt 0 }[/math] such that "the quantity" is [math]\displaystyle{ \,\leq \epsilon }[/math] whenever [math]\displaystyle{ m, n \geq N. }[/math] Here, "the quantity" refers to the inequality's right hand side [math]\displaystyle{ 2 \left\|c_m\right\|^2 + 2 \left\|c_n\right\|^2 - 4 d^2 }[/math] and later in the proof, "the quantity" will also refer to [math]\displaystyle{ \left\|c_m - c_n\right\|^2 }[/math] and then [math]\displaystyle{ \left\|c_m - c_n\right\|. }[/math] By definition of "Cauchy sequence," [math]\displaystyle{ \left(c_n\right)_{n=1}^{\infty} }[/math] is Cauchy in [math]\displaystyle{ H }[/math] if and only if "the quantity" [math]\displaystyle{ \left\|c_m - c_n\right\| }[/math] satisfies this aforementioned condition.
  3. Technically, [math]\displaystyle{ H = K \oplus K^{\bot} }[/math] means that the addition map [math]\displaystyle{ K \times K^{\bot} \to H }[/math] defined by [math]\displaystyle{ (k, p) \mapsto k + p }[/math] is a surjective linear isomorphism and homeomorphism. See the article on complemented subspaces for more details.

References

Bibliography