Hahn–Banach theorem

From HandWiki
Short description: Theorem on extension of bounded linear functionals

The Hahn–Banach theorem is a central tool in functional analysis. It allows the extension of bounded linear functionals defined on a vector subspace of some vector space to the whole space, and it also shows that there are "enough" continuous linear functionals defined on every normed vector space to make the study of the dual space "interesting". Another version of the Hahn–Banach theorem is known as the Hahn–Banach separation theorem or the hyperplane separation theorem, and has numerous uses in convex geometry.

History

The theorem is named for the mathematicians Hans Hahn and Stefan Banach, who proved it independently in the late 1920s. The special case of the theorem for the space [math]\displaystyle{ C[a, b] }[/math] of continuous functions on an interval was proved earlier (in 1912) by Eduard Helly,[1] and a more general extension theorem, the M. Riesz extension theorem, from which the Hahn–Banach theorem can be derived, was proved in 1923 by Marcel Riesz.[2]

The first Hahn–Banach theorem was proved by Eduard Helly in 1912, who showed that certain linear functionals defined on a subspace of a certain type of normed space ([math]\displaystyle{ \Complex^{\N} }[/math]) had an extension of the same norm. Helly did this by first proving that a one-dimensional extension exists (where the linear functional has its domain extended by one dimension) and then using induction. In 1927, Hahn defined general Banach spaces and used Helly's technique to prove a norm-preserving version of the Hahn–Banach theorem for Banach spaces (where a bounded linear functional on a subspace has a bounded linear extension of the same norm to the whole space). In 1929, Banach, who was unaware of Hahn's result, generalized it by replacing the norm-preserving version with the dominated extension version that uses sublinear functions. Whereas Helly's proof used mathematical induction, Hahn and Banach both used transfinite induction.[3]

The Hahn–Banach theorem arose from attempts to solve infinite systems of linear equations. This is needed to solve problems such as the moment problem, whereby given all the potential moments of a function one must determine if a function having these moments exists, and, if so, find it in terms of those moments. Another such problem is the Fourier cosine series problem, whereby given all the potential Fourier cosine coefficients one must determine if a function having those coefficients exists, and, again, find it if so.

Riesz and Helly solved the problem for certain classes of spaces (such as [math]\displaystyle{ L^p([0, 1]) }[/math] and [math]\displaystyle{ C([a, b]) }[/math]) where they discovered that the existence of a solution was equivalent to the existence and continuity of certain linear functionals. In effect, they needed to solve the following problem:[3]

(The vector problem) Given a collection [math]\displaystyle{ \left(f_i\right)_{i \in I} }[/math] of bounded linear functionals on a normed space [math]\displaystyle{ X }[/math] and a collection of scalars [math]\displaystyle{ \left(c_i\right)_{i \in I}, }[/math] determine if there is an [math]\displaystyle{ x \in X }[/math] such that [math]\displaystyle{ f_i(x) = c_i }[/math] for all [math]\displaystyle{ i \in I. }[/math]

If [math]\displaystyle{ X }[/math] happens to be a reflexive space then to solve the vector problem, it suffices to solve the following dual problem:[3]

(The functional problem) Given a collection [math]\displaystyle{ \left(x_i\right)_{i \in I} }[/math] of vectors in a normed space [math]\displaystyle{ X }[/math] and a collection of scalars [math]\displaystyle{ \left(c_i\right)_{i \in I}, }[/math] determine if there is a bounded linear functional [math]\displaystyle{ f }[/math] on [math]\displaystyle{ X }[/math] such that [math]\displaystyle{ f\left(x_i\right) = c_i }[/math] for all [math]\displaystyle{ i \in I. }[/math]

Riesz went on to define the [math]\displaystyle{ L^p([0, 1]) }[/math] spaces ([math]\displaystyle{ 1 \lt p \lt \infty }[/math]) in 1910 and the [math]\displaystyle{ \ell^p }[/math] spaces in 1913. While investigating these spaces he proved a special case of the Hahn–Banach theorem. Helly also proved a special case of the Hahn–Banach theorem in 1912. In 1910, Riesz solved the functional problem for some specific spaces and in 1912, Helly solved it for a more general class of spaces. It was not until 1932 that Banach, in one of the first important applications of the Hahn–Banach theorem, solved the general functional problem. The following theorem states the general functional problem and characterizes its solution.[3]

Theorem[3] (The functional problem) — Let [math]\displaystyle{ \left(x_i\right)_{i \in I} }[/math] be vectors in a real or complex normed space [math]\displaystyle{ X }[/math] and let [math]\displaystyle{ \left(c_i\right)_{i \in I} }[/math] be scalars also indexed by [math]\displaystyle{ I \neq \varnothing. }[/math]

There exists a continuous linear functional [math]\displaystyle{ f }[/math] on [math]\displaystyle{ X }[/math] such that [math]\displaystyle{ f\left(x_i\right) = c_i }[/math] for all [math]\displaystyle{ i \in I }[/math] if and only if there exists a [math]\displaystyle{ K \gt 0 }[/math] such that for any choice of scalars [math]\displaystyle{ \left(s_i\right)_{i \in I} }[/math] where all but finitely many [math]\displaystyle{ s_i }[/math] are [math]\displaystyle{ 0, }[/math] the following holds: [math]\displaystyle{ \left|\sum_{i \in I} s_i c_i\right| \leq K \left\|\sum_{i \in I} s_i x_i\right\|. }[/math]

The Hahn–Banach theorem can be deduced from the above theorem.[3] If [math]\displaystyle{ X }[/math] is reflexive then this theorem solves the vector problem.
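The criterion can be sanity-checked in finite dimensions. The following is a minimal sketch, not taken from the cited sources: the space [math]\displaystyle{ \R^2 }[/math] with the Euclidean norm, the sample vectors and scalars, and the helper name `combo_sides` are all illustrative assumptions. Whenever a finite combination of the [math]\displaystyle{ x_i }[/math] vanishes, the inequality forces the same combination of the [math]\displaystyle{ c_i }[/math] to vanish, so a single witness can rule out every [math]\displaystyle{ K. }[/math]

```python
import math

def combo_sides(xs, cs, s):
    """Return (|sum s_i c_i|, ||sum s_i x_i||_2) for finitely many vectors in R^n."""
    lhs = abs(sum(si * ci for si, ci in zip(s, cs)))
    vec = [sum(si * xi[k] for si, xi in zip(s, xs)) for k in range(len(xs[0]))]
    rhs = math.sqrt(sum(v * v for v in vec))
    return lhs, rhs

# Hypothetical data: x_2 = 2 * x_1, so any functional must satisfy f(x_2) = 2 f(x_1).
xs = [(1.0, 0.0), (2.0, 0.0)]
s = (2.0, -1.0)   # 2 * x_1 - x_2 = 0

# c = (1, 3): left side is 1 and right side is 0, so no K > 0 can work.
bad = combo_sides(xs, (1.0, 3.0), s)

# c = (1, 2): both sides vanish, consistent with the functional f(x) = x[0].
good = combo_sides(xs, (1.0, 2.0), s)
```

With [math]\displaystyle{ c = (1, 2) }[/math] the coordinate functional [math]\displaystyle{ f(x) = x_1 }[/math] solves the problem and [math]\displaystyle{ K = 1 }[/math] works for every choice of scalars, because [math]\displaystyle{ |f(v)| \leq \|v\| }[/math] for all [math]\displaystyle{ v. }[/math]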

Hahn–Banach theorem

A real-valued function [math]\displaystyle{ f : M \to \R }[/math] defined on a subset [math]\displaystyle{ M }[/math] of [math]\displaystyle{ X }[/math] is said to be dominated (above) by a function [math]\displaystyle{ p : X \to \R }[/math] if [math]\displaystyle{ f(m) \leq p(m) }[/math] for every [math]\displaystyle{ m \in M. }[/math] This is why the following version of the Hahn–Banach theorem is called the dominated extension theorem.

Hahn–Banach dominated extension theorem (for real linear functionals)[4][5][6] — If [math]\displaystyle{ p : X \to \R }[/math] is a sublinear function (such as a norm or seminorm for example) defined on a real vector space [math]\displaystyle{ X }[/math] then any linear functional defined on a vector subspace of [math]\displaystyle{ X }[/math] that is dominated above by [math]\displaystyle{ p }[/math] has at least one linear extension to all of [math]\displaystyle{ X }[/math] that is also dominated above by [math]\displaystyle{ p. }[/math]

Explicitly, if [math]\displaystyle{ p : X \to \R }[/math] is a sublinear function, which by definition means that it satisfies [math]\displaystyle{ p(x + y) \leq p(x) + p(y) \quad \text{ and } \quad p(t x) = t p(x) \qquad \text{ for all } \; x, y \in X \; \text{ and all real } \; t \geq 0, }[/math] and if [math]\displaystyle{ f : M \to \R }[/math] is a linear functional defined on a vector subspace [math]\displaystyle{ M }[/math] of [math]\displaystyle{ X }[/math] such that [math]\displaystyle{ f(m) \leq p(m) \quad \text{ for all } m \in M }[/math] then there exists a linear functional [math]\displaystyle{ F : X \to \R }[/math] such that [math]\displaystyle{ F(m) = f(m) \quad \text{ for all } m \in M, }[/math] [math]\displaystyle{ F(x) \leq p(x) \quad ~\;\, \text{ for all } x \in X. }[/math] Moreover, if [math]\displaystyle{ p }[/math] is a seminorm then [math]\displaystyle{ |F(x)| \leq p(x) }[/math] necessarily holds for all [math]\displaystyle{ x \in X. }[/math]

The theorem remains true if the requirements on [math]\displaystyle{ p }[/math] are relaxed to require only that [math]\displaystyle{ p }[/math] be a convex function:[7][8] [math]\displaystyle{ p(t x + (1 - t) y) \leq t p(x) + (1 - t) p(y) \qquad \text{ for all } 0 \lt t \lt 1 \text{ and } x, y \in X. }[/math] A function [math]\displaystyle{ p : X \to \R }[/math] is convex and satisfies [math]\displaystyle{ p(0) \leq 0 }[/math] if and only if [math]\displaystyle{ p(a x + b y) \leq a p(x) + b p(y) }[/math] for all vectors [math]\displaystyle{ x, y \in X }[/math] and all non-negative real [math]\displaystyle{ a, b \geq 0 }[/math] such that [math]\displaystyle{ a + b \leq 1. }[/math] Every sublinear function is a convex function. On the other hand, if [math]\displaystyle{ p : X \to \R }[/math] is convex with [math]\displaystyle{ p(0) \geq 0, }[/math] then the function defined by [math]\displaystyle{ p_0(x) \;\stackrel{\scriptscriptstyle\text{def}}{=}\; \inf_{t \gt 0} \frac{p(tx)}{t} }[/math] is positively homogeneous (because for all [math]\displaystyle{ x }[/math] and [math]\displaystyle{ r \gt 0 }[/math] one has [math]\displaystyle{ p_0(rx) = \inf_{t \gt 0} \frac{p(trx)}{t} = r \inf_{t \gt 0} \frac{p(trx)}{tr} = r \inf_{\tau \gt 0} \frac{p(\tau x)}{\tau} = r p_0(x) }[/math]), hence, being convex, it is sublinear. It is also bounded above by [math]\displaystyle{ p, }[/math] that is, [math]\displaystyle{ p_0 \leq p }[/math] (take [math]\displaystyle{ t = 1 }[/math] in the infimum), and it satisfies [math]\displaystyle{ F \leq p_0 }[/math] for every linear functional [math]\displaystyle{ F \leq p. }[/math] So the extension of the Hahn–Banach theorem to convex functionals does not have a much larger content than the classical one stated for sublinear functionals.
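The passage from a convex function to its sublinear minorant can be illustrated numerically. The sketch below assumes the hypothetical convex function [math]\displaystyle{ p(x) = x^2 + |x| }[/math] on [math]\displaystyle{ \R }[/math] (chosen for illustration, not taken from the cited sources), for which [math]\displaystyle{ p(tx)/t = t x^2 + |x| }[/math] and therefore [math]\displaystyle{ p_0(x) = |x|. }[/math] The infimum is only approximated by sampling [math]\displaystyle{ t = 2^{-k}, }[/math] so the computed values are upper estimates.

```python
def p(x):
    # Illustrative convex function on R with p(0) = 0; it is not positively homogeneous.
    return x * x + abs(x)

def p0(x, depth=40):
    # Approximates inf_{t > 0} p(t x) / t by sampling t = 2**-k for k = 0..depth.
    # Sampling only yields an upper estimate of the true infimum.
    return min(p(t * x) / t for t in (2.0 ** -k for k in range(depth + 1)))

samples = [-3.0, -1.0, 0.0, 0.5, 2.0]

# Distance of the sampled estimate from the exact value |x| for this particular p.
gaps = [abs(p0(x) - abs(x)) for x in samples]

# p0 never exceeds p, because t = 1 is among the sampled values.
below_p = all(p0(x) <= p(x) for x in samples)
```

In this example the gaps are tiny, and the estimate is approximately positively homogeneous, in line with the argument in the text.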

If [math]\displaystyle{ F : X \to \R }[/math] is linear then [math]\displaystyle{ F \leq p }[/math] if and only if[4] [math]\displaystyle{ -p(-x) \leq F(x) \leq p(x) \quad \text{ for all } x \in X, }[/math] which is the (equivalent) conclusion that some authors[4] write instead of [math]\displaystyle{ F \leq p. }[/math] It follows that if [math]\displaystyle{ p : X \to \R }[/math] is also symmetric, meaning that [math]\displaystyle{ p(-x) = p(x) }[/math] holds for all [math]\displaystyle{ x \in X, }[/math] then [math]\displaystyle{ F \leq p }[/math] if and only if [math]\displaystyle{ |F| \leq p. }[/math] Every norm is a seminorm and both are symmetric balanced sublinear functions. A sublinear function is a seminorm if and only if it is a balanced function. On a real vector space (although not on a complex vector space), a sublinear function is a seminorm if and only if it is symmetric. The identity function [math]\displaystyle{ \R \to \R }[/math] on [math]\displaystyle{ X := \R }[/math] is an example of a sublinear function that is not a seminorm.
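The distinction between sublinear functions and seminorms on [math]\displaystyle{ X = \R }[/math] can be spot-checked on a few sample points. This is a minimal sketch: the helper names, the sample points, and the extra example `positive_part` are illustrative assumptions, and a finite check of this kind supports the claims without proving them.

```python
def is_sublinear_on(p, points, scales):
    """Check subadditivity and positive homogeneity of p on finitely many samples."""
    subadditive = all(p(x + y) <= p(x) + p(y) for x in points for y in points)
    homogeneous = all(p(t * x) == t * p(x) for x in points for t in scales)
    return subadditive and homogeneous

def is_symmetric_on(p, points):
    """Check p(-x) == p(x) on the sample points."""
    return all(p(-x) == p(x) for x in points)

points = [-3, -1, 0, 2, 5]   # integer samples keep the arithmetic exact
scales = [0, 1, 2, 7]        # non-negative scalars t

identity = lambda x: x               # sublinear but not symmetric
positive_part = lambda x: max(x, 0)  # another sublinear, non-symmetric function

# For the linear functional F(x) = x and p = identity, F <= p holds and so
# does the equivalent two-sided form -p(-x) <= F(x) <= p(x).
F = lambda x: x
sandwich = all(-identity(-x) <= F(x) <= identity(x) for x in points)
```

On these samples the absolute value passes both checks, as a seminorm should, while the identity and the positive part are sublinear but fail symmetry.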

For complex or real vector spaces

The dominated extension theorem for real linear functionals implies the following alternative statement of the Hahn–Banach theorem that can be applied to linear functionals on real or complex vector spaces.

Hahn–Banach theorem[3][9] — Suppose [math]\displaystyle{ p : X \to \R }[/math] is a seminorm on a vector space [math]\displaystyle{ X }[/math] over the field [math]\displaystyle{ \mathbf{K}, }[/math] which is either [math]\displaystyle{ \R }[/math] or [math]\displaystyle{ \Complex. }[/math] If [math]\displaystyle{ f : M \to \mathbf{K} }[/math] is a linear functional on a vector subspace [math]\displaystyle{ M }[/math] such that [math]\displaystyle{ |f(m)| \leq p(m) \quad \text{ for all } m \in M, }[/math] then there exists a linear functional [math]\displaystyle{ F : X \to \mathbf{K} }[/math] such that [math]\displaystyle{ F(m) = f(m) \quad \; \text{ for all } m \in M, }[/math] [math]\displaystyle{ |F(x)| \leq p(x) \quad \;\, \text{ for all } x \in X. }[/math]

The theorem remains true if the requirements on [math]\displaystyle{ p }[/math] are relaxed to require only that for all [math]\displaystyle{ x, y \in X }[/math] and all scalars [math]\displaystyle{ a }[/math] and [math]\displaystyle{ b }[/math] satisfying [math]\displaystyle{ |a| + |b| \leq 1, }[/math][8] [math]\displaystyle{ p(a x + b y) \leq |a| p(x) + |b| p(y). }[/math] This condition holds if and only if [math]\displaystyle{ p }[/math] is a convex and balanced function satisfying [math]\displaystyle{ p(0) \leq 0, }[/math] or equivalently, if and only if it is convex, satisfies [math]\displaystyle{ p(0) \leq 0, }[/math] and [math]\displaystyle{ p(u x) \leq p(x) }[/math] for all [math]\displaystyle{ x \in X }[/math] and all unit length scalars [math]\displaystyle{ u. }[/math]

A complex-valued functional [math]\displaystyle{ F }[/math] is said to be dominated by [math]\displaystyle{ p }[/math] if [math]\displaystyle{ |F(x)| \leq p(x) }[/math] for all [math]\displaystyle{ x }[/math] in the domain of [math]\displaystyle{ F. }[/math] With this terminology, the above statements of the Hahn–Banach theorem can be restated more succinctly:

Hahn–Banach dominated extension theorem: If [math]\displaystyle{ p : X \to \R }[/math] is a seminorm defined on a real or complex vector space [math]\displaystyle{ X, }[/math] then every dominated linear functional defined on a vector subspace of [math]\displaystyle{ X }[/math] has a dominated linear extension to all of [math]\displaystyle{ X. }[/math] In the case where [math]\displaystyle{ X }[/math] is a real vector space and [math]\displaystyle{ p : X \to \R }[/math] is merely a convex or sublinear function, this conclusion will remain true if both instances of "dominated" (meaning [math]\displaystyle{ |F| \leq p }[/math]) are weakened to instead mean "dominated above" (meaning [math]\displaystyle{ F \leq p }[/math]).[7][8]

Proof

The following observations allow the Hahn–Banach theorem for real vector spaces to be applied to (complex-valued) linear functionals on complex vector spaces.

Every linear functional [math]\displaystyle{ F : X \to \Complex }[/math] on a complex vector space is completely determined by its real part [math]\displaystyle{ \; \operatorname{Re} F : X \to \R \; }[/math] through the formula[6][proof 1] [math]\displaystyle{ F(x) \;=\; \operatorname{Re} F(x) - i \operatorname{Re} F(i x) \qquad \text{ for all } x \in X }[/math] and moreover, if [math]\displaystyle{ \|\cdot\| }[/math] is a norm on [math]\displaystyle{ X }[/math] then their dual norms are equal: [math]\displaystyle{ \|F\| = \|\operatorname{Re} F\|. }[/math][10] In particular, a linear functional on [math]\displaystyle{ X }[/math] extends another one defined on [math]\displaystyle{ M \subseteq X }[/math] if and only if their real parts are equal on [math]\displaystyle{ M }[/math] (in other words, a linear functional [math]\displaystyle{ F }[/math] extends [math]\displaystyle{ f }[/math] if and only if [math]\displaystyle{ \operatorname{Re} F }[/math] extends [math]\displaystyle{ \operatorname{Re} f }[/math]). The real part of a linear functional on [math]\displaystyle{ X }[/math] is always a real-linear functional (meaning that it is linear when [math]\displaystyle{ X }[/math] is considered as a real vector space) and if [math]\displaystyle{ R : X \to \R }[/math] is a real-linear functional on a complex vector space then [math]\displaystyle{ x \mapsto R(x) - i R(i x) }[/math] defines the unique linear functional on [math]\displaystyle{ X }[/math] whose real part is [math]\displaystyle{ R. }[/math]

If [math]\displaystyle{ F }[/math] is a linear functional on a (complex or real) vector space [math]\displaystyle{ X }[/math] and if [math]\displaystyle{ p : X \to \R }[/math] is a seminorm then[6][proof 2] [math]\displaystyle{ |F| \,\leq\, p \quad \text{ if and only if } \quad \operatorname{Re} F \,\leq\, p. }[/math] Stated in simpler language, a linear functional is dominated by a seminorm [math]\displaystyle{ p }[/math] if and only if its real part is dominated above by [math]\displaystyle{ p. }[/math]

The proof above shows that when [math]\displaystyle{ p }[/math] is a seminorm then there is a one-to-one correspondence between dominated linear extensions of [math]\displaystyle{ f : M \to \Complex }[/math] and dominated real-linear extensions of [math]\displaystyle{ \operatorname{Re} f : M \to \R; }[/math] the proof even gives a formula for explicitly constructing a linear extension of [math]\displaystyle{ f }[/math] from any given real-linear extension of its real part.
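The reconstruction formula [math]\displaystyle{ F(x) = \operatorname{Re} F(x) - i \operatorname{Re} F(i x) }[/math] can be checked on a concrete functional. The sketch below assumes a hypothetical linear functional on [math]\displaystyle{ \Complex^2 }[/math] with coefficients [math]\displaystyle{ (2 + 3i, \, -1 + i) }[/math]; the function names are illustrative only.

```python
def F(z):
    # Hypothetical complex-linear functional on C^2.
    a = (2 + 3j, -1 + 1j)
    return a[0] * z[0] + a[1] * z[1]

def real_part(z):
    # Re F, which is real-linear but not complex-linear.
    return F(z).real

def rebuild(R, z):
    # Recover the complex-linear functional from its real part R via
    # z -> R(z) - i R(i z).
    iz = tuple(1j * w for w in z)
    return complex(R(z), -R(iz))

z = (1 + 2j, 3 - 1j)
original = F(z)                    # value computed directly
recovered = rebuild(real_part, z)  # value rebuilt from the real part alone
```

Only the real part is passed to `rebuild`, which is the sense in which a complex-linear functional is determined by its real part.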

Continuity

A linear functional [math]\displaystyle{ F }[/math] on a topological vector space is continuous if and only if this is true of its real part [math]\displaystyle{ \operatorname{Re} F; }[/math] if the domain is a normed space then [math]\displaystyle{ \|F\| = \|\operatorname{Re} F\| }[/math] (where one side is infinite if and only if the other side is infinite).[10] Assume [math]\displaystyle{ X }[/math] is a topological vector space and [math]\displaystyle{ p : X \to \R }[/math] is a sublinear function. If [math]\displaystyle{ p }[/math] is a continuous sublinear function that dominates a linear functional [math]\displaystyle{ F }[/math] then [math]\displaystyle{ F }[/math] is necessarily continuous.[6] Moreover, a linear functional [math]\displaystyle{ F }[/math] is continuous if and only if its absolute value [math]\displaystyle{ |F| }[/math] (which is a seminorm that dominates [math]\displaystyle{ F }[/math]) is continuous.[6] In particular, a linear functional is continuous if and only if it is dominated by some continuous sublinear function.

Proof

The Hahn–Banach theorem for real vector spaces ultimately follows from Helly's initial result for the special case where the linear functional is extended from [math]\displaystyle{ M }[/math] to a larger vector space in which [math]\displaystyle{ M }[/math] has codimension [math]\displaystyle{ 1. }[/math][3]

Lemma[6] (One–dimensional dominated extension theorem) — Let [math]\displaystyle{ p : X \to \R }[/math] be a sublinear function on a real vector space [math]\displaystyle{ X, }[/math] let [math]\displaystyle{ f : M \to \R }[/math] be a linear functional on a proper vector subspace [math]\displaystyle{ M \subsetneq X }[/math] such that [math]\displaystyle{ f \leq p }[/math] on [math]\displaystyle{ M }[/math] (meaning [math]\displaystyle{ f(m) \leq p(m) }[/math] for all [math]\displaystyle{ m \in M }[/math]), and let [math]\displaystyle{ x \in X }[/math] be a vector not in [math]\displaystyle{ M }[/math] (so [math]\displaystyle{ M \oplus \R x = \operatorname{span} \{M, x\} }[/math]). There exists a linear extension [math]\displaystyle{ F : M \oplus \R x \to \R }[/math] of [math]\displaystyle{ f }[/math] such that [math]\displaystyle{ F \leq p }[/math] on [math]\displaystyle{ M \oplus \R x. }[/math]

This lemma remains true if [math]\displaystyle{ p : X \to \R }[/math] is merely a convex function instead of a sublinear function.[7][8]

Proof

Assume that [math]\displaystyle{ p }[/math] is convex, which means that [math]\displaystyle{ p(t y + (1 - t) z) \leq t p(y) + (1 - t) p(z) }[/math] for all [math]\displaystyle{ 0 \leq t \leq 1 }[/math] and [math]\displaystyle{ y, z \in X. }[/math] Let [math]\displaystyle{ M, }[/math] [math]\displaystyle{ f : M \to \R, }[/math] and [math]\displaystyle{ x \in X \setminus M }[/math] be as in the lemma's statement. Given any [math]\displaystyle{ m, n \in M }[/math] and any positive real [math]\displaystyle{ r, s \gt 0, }[/math] the positive real numbers [math]\displaystyle{ t := \tfrac{s}{r + s} }[/math] and [math]\displaystyle{ \tfrac{r}{r + s} = 1 - t }[/math] sum to [math]\displaystyle{ 1 }[/math] so that the convexity of [math]\displaystyle{ p }[/math] on [math]\displaystyle{ X }[/math] guarantees [math]\displaystyle{ \begin{alignat}{9} p\left(\tfrac{s}{r + s} m + \tfrac{r}{r + s} n\right) ~&=~ p\big(\tfrac{s}{r + s} (m - r x) &&+ \tfrac{r}{r + s} (n + s x)\big) && \\ &\leq~ \tfrac{s}{r + s} \; p(m - r x) &&+ \tfrac{r}{r + s} \; p(n + s x) && \\ \end{alignat} }[/math] and hence [math]\displaystyle{ \begin{alignat}{9} s f(m) + r f(n) ~&=~ (r + s) \; f\left(\tfrac{s}{r + s} m + \tfrac{r}{r + s} n\right) && \qquad \text{ by linearity of } f \\ &\leq~ (r + s) \; p\left(\tfrac{s}{r + s} m + \tfrac{r}{r + s} n\right) && \qquad f \leq p \text{ on } M \\ &\leq~ s p(m - r x) + r p(n + s x) \\ \end{alignat} }[/math] thus proving that [math]\displaystyle{ - s p(m - r x) + s f(m) ~\leq~ r p(n + s x) - r f(n), }[/math] which after multiplying both sides by [math]\displaystyle{ \tfrac{1}{rs} }[/math] becomes [math]\displaystyle{ \tfrac{1}{r} [- p(m - r x) + f(m)] ~\leq~ \tfrac{1}{s} [p(n + s x) - f(n)]. }[/math] Since this inequality holds for every choice of [math]\displaystyle{ m, n \in M }[/math] and [math]\displaystyle{ r, s \gt 0, }[/math] each side is bounded by any single value of the other, so the values defined by [math]\displaystyle{ a = \sup_{\stackrel{m \in M}{r \gt 0}} \tfrac{1}{r} [- p(m - r x) + f(m)] \qquad \text{ and } \qquad c = \inf_{\stackrel{n \in M}{s \gt 0}} \tfrac{1}{s} [p(n + s x) - f(n)] }[/math] are real numbers that satisfy [math]\displaystyle{ a \leq c. }[/math] For any real [math]\displaystyle{ b \in \R }[/math] define [math]\displaystyle{ F_b : M \oplus \R x \to \R }[/math] by [math]\displaystyle{ F_b(m + r x) = f(m) + r b. }[/math] It can be verified that if [math]\displaystyle{ a \leq b \leq c }[/math] then [math]\displaystyle{ F_b \leq p, }[/math] where [math]\displaystyle{ r b \leq p(m + r x) - f(m) }[/math] follows from [math]\displaystyle{ b \leq c }[/math] when [math]\displaystyle{ r \gt 0 }[/math] (respectively, follows from [math]\displaystyle{ a \leq b }[/math] when [math]\displaystyle{ r \lt 0 }[/math]). [math]\displaystyle{ \blacksquare }[/math]
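The bounds [math]\displaystyle{ a }[/math] and [math]\displaystyle{ c }[/math] from the proof can be computed in a small example. The sketch below makes several illustrative assumptions that are not in the cited sources: [math]\displaystyle{ X = \R^2, }[/math] [math]\displaystyle{ p }[/math] is the [math]\displaystyle{ \ell^1 }[/math] norm, [math]\displaystyle{ M = \{(t, 0)\} }[/math] with [math]\displaystyle{ f((t, 0)) = t, }[/math] and [math]\displaystyle{ x = (0, 1). }[/math] Because this [math]\displaystyle{ p }[/math] is positively homogeneous, the supremum and infimum over [math]\displaystyle{ r, s \gt 0 }[/math] reduce to the case [math]\displaystyle{ r = s = 1, }[/math] and a finite grid happens to attain both.

```python
def p(v):
    # Illustrative sublinear function: the l^1 norm on R^2.
    return abs(v[0]) + abs(v[1])

def f(t):
    # f is defined on M = {(t, 0)} by f((t, 0)) = t, so f <= p on M.
    return t

x = (0.0, 1.0)                          # a vector outside M
grid = [k / 4 for k in range(-40, 41)]  # sample points t, i.e. m = (t, 0)

# With r = s = 1 (justified by homogeneity of p in this example):
#   a = sup_m [f(m) - p(m - x)],   c = inf_n [p(n + x) - f(n)].
a = max(f(t) - p((t - x[0], 0.0 - x[1])) for t in grid)
c = min(p((t + x[0], 0.0 + x[1])) - f(t) for t in grid)

def F(b, v):
    # Extension F_b(m + r x) = f(m) + r b, where v = (t, r) = (t, 0) + r x.
    return f(v[0]) + v[1] * b
```

Here [math]\displaystyle{ a = -1 }[/math] and [math]\displaystyle{ c = 1, }[/math] so every [math]\displaystyle{ b \in [-1, 1] }[/math] gives a dominated extension [math]\displaystyle{ F_b(t, r) = t + b r, }[/math] which shows that the extension need not be unique. A value such as [math]\displaystyle{ b = 1.5 }[/math] fails at the point [math]\displaystyle{ (0, 1). }[/math]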

The lemma above is the key step in deducing the dominated extension theorem from Zorn's lemma.

When [math]\displaystyle{ M }[/math] has countable codimension, induction together with the lemma completes the proof of the Hahn–Banach theorem. The standard proof of the general case uses Zorn's lemma, although the strictly weaker ultrafilter lemma[11] (which is equivalent to the compactness theorem and to the Boolean prime ideal theorem) may be used instead. Hahn–Banach can also be proved using Tychonoff's theorem for compact Hausdorff spaces[12] (which is also equivalent to the ultrafilter lemma).

The Mizar project has completely formalized and automatically checked the proof of the Hahn–Banach theorem in the HAHNBAN file.[13]

Continuous extension theorem

The Hahn–Banach theorem can be used to guarantee the existence of continuous linear extensions of continuous linear functionals.

Hahn–Banach continuous extension theorem[14] — Every continuous linear functional [math]\displaystyle{ f }[/math] defined on a vector subspace [math]\displaystyle{ M }[/math] of a (real or complex) locally convex topological vector space [math]\displaystyle{ X }[/math] has a continuous linear extension [math]\displaystyle{ F }[/math] to all of [math]\displaystyle{ X. }[/math] If in addition [math]\displaystyle{ X }[/math] is a normed space, then this extension can be chosen so that its dual norm is equal to that of [math]\displaystyle{ f. }[/math]

In category-theoretic terms, the underlying field of the vector space is an injective object in the category of locally convex vector spaces.

On a normed (or seminormed) space, a linear extension [math]\displaystyle{ F }[/math] of a bounded linear functional [math]\displaystyle{ f }[/math] is said to be norm-preserving if it has the same dual norm as the original functional: [math]\displaystyle{ \|F\| = \|f\|. }[/math] Because of this terminology, the second part of the above theorem is sometimes referred to as the "norm-preserving" version of the Hahn–Banach theorem.[15] Explicitly:

Norm-preserving Hahn–Banach continuous extension theorem[15] — Every continuous linear functional [math]\displaystyle{ f }[/math] defined on a vector subspace [math]\displaystyle{ M }[/math] of a (real or complex) normed space [math]\displaystyle{ X }[/math] has a continuous linear extension [math]\displaystyle{ F }[/math] to all of [math]\displaystyle{ X }[/math] that satisfies [math]\displaystyle{ \|f\| = \|F\|. }[/math]

Proof of the continuous extension theorem

The following observations allow the continuous extension theorem to be deduced from the Hahn–Banach theorem.[16]

The absolute value of a linear functional is always a seminorm. A linear functional [math]\displaystyle{ F }[/math] on a topological vector space [math]\displaystyle{ X }[/math] is continuous if and only if its absolute value [math]\displaystyle{ |F| }[/math] is continuous, which happens if and only if there exists a continuous seminorm [math]\displaystyle{ p }[/math] on [math]\displaystyle{ X }[/math] such that [math]\displaystyle{ |F| \leq p }[/math] on the domain of [math]\displaystyle{ F. }[/math][17] If [math]\displaystyle{ X }[/math] is a locally convex space then this statement remains true when the linear functional [math]\displaystyle{ F }[/math] is defined on a proper vector subspace of [math]\displaystyle{ X. }[/math]

Proof for normed spaces

A linear functional [math]\displaystyle{ f }[/math] on a normed space is continuous if and only if it is bounded, which means that its dual norm [math]\displaystyle{ \|f\| = \sup \{|f(m)| : \|m\| \leq 1, m \in \operatorname{domain} f\} }[/math] is finite, in which case [math]\displaystyle{ |f(m)| \leq \|f\| \|m\| }[/math] holds for every point [math]\displaystyle{ m }[/math] in its domain. Moreover, if [math]\displaystyle{ c \geq 0 }[/math] is such that [math]\displaystyle{ |f(m)| \leq c \|m\| }[/math] for all [math]\displaystyle{ m }[/math] in the functional's domain, then necessarily [math]\displaystyle{ \|f\| \leq c. }[/math] If [math]\displaystyle{ F }[/math] is a linear extension of a linear functional [math]\displaystyle{ f }[/math] then their dual norms always satisfy [math]\displaystyle{ \|f\| \leq \|F\| }[/math][proof 3] so that equality [math]\displaystyle{ \|f\| = \|F\| }[/math] is equivalent to [math]\displaystyle{ \|F\| \leq \|f\|, }[/math] which holds if and only if [math]\displaystyle{ |F(x)| \leq \|f\| \|x\| }[/math] for every point [math]\displaystyle{ x }[/math] in the extension's domain. This can be restated in terms of the function [math]\displaystyle{ \|f\| \, \|\cdot\| : X \to \Reals }[/math] defined by [math]\displaystyle{ x \mapsto \|f\| \, \|x\|, }[/math] which is always a seminorm:[note 4]

A linear extension of a bounded linear functional [math]\displaystyle{ f }[/math] is norm-preserving if and only if the extension is dominated by the seminorm [math]\displaystyle{ \|f\| \, \|\cdot\|. }[/math]

Applying the Hahn–Banach theorem to [math]\displaystyle{ f }[/math] with this seminorm [math]\displaystyle{ \|f\| \, \|\cdot\| }[/math] thus produces a dominated linear extension whose norm is (necessarily) equal to that of [math]\displaystyle{ f, }[/math] which proves the theorem.
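The equivalence between norm preservation and domination by [math]\displaystyle{ \|f\| \, \|\cdot\| }[/math] can be seen in a two-dimensional example. The sketch below assumes [math]\displaystyle{ X = \R^2 }[/math] with the Euclidean norm, [math]\displaystyle{ M = \{(t, 0)\}, }[/math] and [math]\displaystyle{ f((t, 0)) = t, }[/math] so that [math]\displaystyle{ \|f\| = 1; }[/math] these choices and the function names are illustrative. It uses the fact that for the Euclidean norm the dual norm of [math]\displaystyle{ x \mapsto c \cdot x }[/math] is the Euclidean norm of [math]\displaystyle{ c }[/math] (Cauchy–Schwarz).

```python
import math

def dual_norm_euclid(coeffs):
    # For F(x) = coeffs . x on R^n with the Euclidean norm, the dual norm equals
    # the Euclidean norm of coeffs (Cauchy-Schwarz, with equality at x = coeffs).
    return math.sqrt(sum(c * c for c in coeffs))

norm_f = 1.0   # f((t, 0)) = t on M = {(t, 0)} satisfies |f(m)| = ||m||

def extension(b):
    # Coefficients of the linear extension F_b(x) = x[0] + b * x[1] of f.
    return (1.0, b)

def is_norm_preserving(b):
    return math.isclose(dual_norm_euclid(extension(b)), norm_f)

def dominated_at(b, x):
    # Checks |F_b(x)| <= ||f|| * ||x|| at the single point x (small tolerance).
    value = abs(x[0] + b * x[1])
    return value <= norm_f * math.hypot(x[0], x[1]) + 1e-12
```

In this setting only [math]\displaystyle{ b = 0 }[/math] preserves the norm. For [math]\displaystyle{ b = 0.5 }[/math] the dual norm exceeds [math]\displaystyle{ 1 }[/math] and domination fails at [math]\displaystyle{ x = (1, 0.5). }[/math]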

Non-locally convex spaces

The continuous extension theorem might fail if the topological vector space (TVS) [math]\displaystyle{ X }[/math] is not locally convex. For example, for [math]\displaystyle{ 0 \lt p \lt 1, }[/math] the Lebesgue space [math]\displaystyle{ L^p([0, 1]) }[/math] is a complete metrizable TVS (an F-space) that is not locally convex (in fact, its only convex open subsets are itself [math]\displaystyle{ L^p([0, 1]) }[/math] and the empty set) and the only continuous linear functional on [math]\displaystyle{ L^p([0, 1]) }[/math] is the constant [math]\displaystyle{ 0 }[/math] function (Rudin 1991). Since [math]\displaystyle{ L^p([0, 1]) }[/math] is Hausdorff, every finite-dimensional vector subspace [math]\displaystyle{ M \subseteq L^p([0, 1]) }[/math] is linearly homeomorphic to Euclidean space [math]\displaystyle{ \Reals^{\dim M} }[/math] or [math]\displaystyle{ \Complex^{\dim M} }[/math] (by F. Riesz's theorem) and so every non-zero linear functional [math]\displaystyle{ f }[/math] on [math]\displaystyle{ M }[/math] is continuous but none has a continuous linear extension to all of [math]\displaystyle{ L^p([0, 1]). }[/math] However, it is possible for a TVS [math]\displaystyle{ X }[/math] to not be locally convex but nevertheless have enough continuous linear functionals that its continuous dual space [math]\displaystyle{ X^* }[/math] separates points; for such a TVS, a continuous linear functional defined on a vector subspace might have a continuous linear extension to the whole space.
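The failure of local convexity can be made concrete with step functions. The sketch below is illustrative only and assumes the usual quasi-norm [math]\displaystyle{ \|h\|_p = \left(\int_0^1 |h|^p\right)^{1/p} }[/math] on [math]\displaystyle{ L^p([0, 1]), }[/math] which the text above does not spell out; the functions and the choice [math]\displaystyle{ p = 1/2 }[/math] are hypothetical. It exhibits two functions in the closed ball of radius [math]\displaystyle{ 1 }[/math] whose midpoint lies outside it, so that ball is not convex. This is consistent with, but does not prove, the stronger statement about open convex sets.

```python
def quasi_norm(values, p):
    # (integral over [0, 1] of |h|^p)^(1/p) for a step function h that takes
    # values[k] on the k-th of len(values) equal subintervals.
    mean = sum(abs(v) ** p for v in values) / len(values)
    return mean ** (1.0 / p)

p = 0.5
g = [4.0, 0.0]   # 4 on the first half of [0, 1], 0 on the second half
h = [0.0, 4.0]   # 0 on the first half, 4 on the second half
mid = [(u + v) / 2 for u, v in zip(g, h)]   # midpoint (g + h) / 2, constant 2

norms = (quasi_norm(g, p), quasi_norm(h, p), quasi_norm(mid, p))
```

Both `g` and `h` have quasi-norm [math]\displaystyle{ 1, }[/math] while their midpoint has quasi-norm [math]\displaystyle{ 2. }[/math]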

If the TVS [math]\displaystyle{ X }[/math] is not locally convex then there might not exist any continuous seminorm [math]\displaystyle{ p : X \to \R }[/math] defined on [math]\displaystyle{ X }[/math] (not just on [math]\displaystyle{ M }[/math]) that dominates [math]\displaystyle{ f, }[/math] in which case the Hahn–Banach theorem cannot be applied as it was in the above proof of the continuous extension theorem. However, the proof's argument can be generalized to give a characterization of when a continuous linear functional has a continuous linear extension: If [math]\displaystyle{ X }[/math] is any TVS (not necessarily locally convex), then a continuous linear functional [math]\displaystyle{ f }[/math] defined on a vector subspace [math]\displaystyle{ M }[/math] has a continuous linear extension [math]\displaystyle{ F }[/math] to all of [math]\displaystyle{ X }[/math] if and only if there exists some continuous seminorm [math]\displaystyle{ p }[/math] on [math]\displaystyle{ X }[/math] that dominates [math]\displaystyle{ f. }[/math] Specifically, given a continuous linear extension [math]\displaystyle{ F, }[/math] the function [math]\displaystyle{ p := |F| }[/math] is a continuous seminorm on [math]\displaystyle{ X }[/math] that dominates [math]\displaystyle{ f; }[/math] conversely, given a continuous seminorm [math]\displaystyle{ p : X \to \Reals }[/math] on [math]\displaystyle{ X }[/math] that dominates [math]\displaystyle{ f, }[/math] any dominated linear extension of [math]\displaystyle{ f }[/math] to [math]\displaystyle{ X }[/math] (the existence of which is guaranteed by the Hahn–Banach theorem) will be a continuous linear extension.

Geometric Hahn–Banach (the Hahn–Banach separation theorems)

The key element of the Hahn–Banach theorem is fundamentally a result about the separation of two convex sets: [math]\displaystyle{ \{-p(- x - n) - f(n) : n \in M\}, }[/math] and [math]\displaystyle{ \{p(m + x) - f(m) : m \in M\}. }[/math] This sort of argument appears widely in convex geometry,[18] optimization theory, and economics. Lemmas to this end derived from the original Hahn–Banach theorem are known as the Hahn–Banach separation theorems.[19][20] They are generalizations of the hyperplane separation theorem, which states that two disjoint nonempty convex subsets of a finite-dimensional space [math]\displaystyle{ \R^n }[/math] can be separated by some affine hyperplane, which is a fiber (level set) of the form [math]\displaystyle{ f^{-1}(s) = \{x : f(x) = s\} }[/math] where [math]\displaystyle{ f \neq 0 }[/math] is a non-zero linear functional and [math]\displaystyle{ s }[/math] is a scalar.

Theorem[19] — Let [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] be non-empty convex subsets of a real locally convex topological vector space [math]\displaystyle{ X. }[/math] If [math]\displaystyle{ \operatorname{Int} A \neq \varnothing }[/math] and [math]\displaystyle{ B \cap \operatorname{Int} A = \varnothing }[/math] then there exists a continuous linear functional [math]\displaystyle{ f }[/math] on [math]\displaystyle{ X }[/math] such that [math]\displaystyle{ \sup f(A) \leq \inf f(B) }[/math] and [math]\displaystyle{ f(a) \lt \inf f(B) }[/math] for all [math]\displaystyle{ a \in \operatorname{Int} A }[/math] (such an [math]\displaystyle{ f }[/math] is necessarily non-zero).

When the convex sets have additional properties, such as being open or compact for example, then the conclusion can be substantially strengthened:

Theorem[3][21] — Let [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] be convex non-empty disjoint subsets of a real topological vector space [math]\displaystyle{ X. }[/math]

  • If [math]\displaystyle{ A }[/math] is open then [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] are separated by a closed hyperplane. Explicitly, this means that there exists a continuous linear map [math]\displaystyle{ f : X \to \mathbf{K} }[/math] and [math]\displaystyle{ s \in \R }[/math] such that [math]\displaystyle{ f(a) \lt s \leq f(b) }[/math] for all [math]\displaystyle{ a \in A, b \in B. }[/math] If both [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] are open then the right-hand side may be taken strict as well.
  • If [math]\displaystyle{ X }[/math] is locally convex, [math]\displaystyle{ A }[/math] is compact, and [math]\displaystyle{ B }[/math] closed, then [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] are strictly separated: there exists a continuous linear map [math]\displaystyle{ f : X \to \mathbf{K} }[/math] and [math]\displaystyle{ s, t \in \R }[/math] such that [math]\displaystyle{ f(a) \lt t \lt s \lt f(b) }[/math] for all [math]\displaystyle{ a \in A, b \in B. }[/math]

If [math]\displaystyle{ X }[/math] is complex (rather than real) then the same claims hold, but for the real part of [math]\displaystyle{ f. }[/math]
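
As a finite-dimensional illustration of strict separation, the following sketch separates two disjoint closed disks in the Euclidean plane (both compact and convex). The centres, radii, and helper names such as `separating_functional` are assumptions chosen for this example. The functional is taken to be the inner product with the unit vector joining the centres, which works for disks but is not a general construction.

```python
import math
import random

def separating_functional(center_a, center_b):
    """Return f(x) = <w, x>, with w the unit vector pointing from center_a to center_b."""
    wx, wy = center_b[0] - center_a[0], center_b[1] - center_a[1]
    length = math.hypot(wx, wy)
    wx, wy = wx / length, wy / length
    return lambda x: wx * x[0] + wy * x[1]

def sample_disk(center, radius, rng):
    """Draw a random point from the closed disk with the given centre and radius."""
    r = radius * math.sqrt(rng.random())
    angle = 2 * math.pi * rng.random()
    return (center[0] + r * math.cos(angle), center[1] + r * math.sin(angle))

# Hypothetical data: A and B are disjoint closed disks in the plane.
center_a, radius_a = (0.0, 0.0), 1.0
center_b, radius_b = (4.0, 3.0), 2.0

f = separating_functional(center_a, center_b)

# Because w is a unit vector, f ranges over [f(centre) - radius, f(centre) + radius] on a disk.
sup_f_a = f(center_a) + radius_a
inf_f_b = f(center_b) - radius_b

# Choose thresholds t < s strictly between sup f(A) and inf f(B).
t = sup_f_a + (inf_f_b - sup_f_a) / 3
s = sup_f_a + 2 * (inf_f_b - sup_f_a) / 3

rng = random.Random(0)
points_a = [sample_disk(center_a, radius_a, rng) for _ in range(1000)]
points_b = [sample_disk(center_b, radius_b, rng) for _ in range(1000)]

# Spot-check f(a) < t < s < f(b) on the sampled points.
strictly_separated = (
    all(f(a) < t for a in points_a) and t < s and all(s < f(b) for b in points_b)
)
```

The random sampling is only a sanity check; the actual guarantee comes from the closed-form values of `sup_f_a` and `inf_f_b`.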

The following important corollary is known as the Geometric Hahn–Banach theorem or Mazur's theorem (also known as the Ascoli–Mazur theorem[22]). It follows from the first bullet above and the convexity of [math]\displaystyle{ M. }[/math]

Theorem (Mazur)[23] — Let [math]\displaystyle{ M }[/math] be a vector subspace of the topological vector space [math]\displaystyle{ X }[/math] and suppose [math]\displaystyle{ K }[/math] is a non-empty convex open subset of [math]\displaystyle{ X }[/math] with [math]\displaystyle{ K \cap M = \varnothing. }[/math] Then there is a closed hyperplane (codimension-1 vector subspace) [math]\displaystyle{ N \subseteq X }[/math] that contains [math]\displaystyle{ M, }[/math] but remains disjoint from [math]\displaystyle{ K. }[/math]

Mazur's theorem clarifies that vector subspaces (even those that are not closed) can be characterized by linear functionals.

Corollary[24] (Separation of a subspace and an open convex set) — Let [math]\displaystyle{ M }[/math] be a vector subspace of a locally convex topological vector space [math]\displaystyle{ X, }[/math] and [math]\displaystyle{ U }[/math] be a non-empty open convex subset disjoint from [math]\displaystyle{ M. }[/math] Then there exists a continuous linear functional [math]\displaystyle{ f }[/math] on [math]\displaystyle{ X }[/math] such that [math]\displaystyle{ f(m) = 0 }[/math] for all [math]\displaystyle{ m \in M }[/math] and [math]\displaystyle{ \operatorname{Re} f \gt 0 }[/math] on [math]\displaystyle{ U. }[/math]
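
A small sketch of this corollary in three-dimensional Euclidean space, under assumptions chosen only for illustration: the subspace is the line spanned by (1, 1, 0), the open convex set is the open unit ball around the hypothetical centre (1, -1, 1), and the functional is the inner product with that centre. The centre is orthogonal to the line, so the functional vanishes on the subspace, and its kernel is a closed hyperplane of the kind described in Mazur's theorem.

```python
import math
import random

center = (1.0, -1.0, 1.0)   # hypothetical centre of the open ball U (radius 1)
radius = 1.0

def f(x):
    """f(x) = <center, x>; vanishes on M = span{(1, 1, 0)} since <center, (1, 1, 0)> = 0."""
    return sum(c * xi for c, xi in zip(center, x))

def sample_open_ball(center, radius, rng):
    """Draw a point at distance strictly less than radius from center (up to rounding)."""
    direction = [rng.gauss(0.0, 1.0) for _ in center]
    length = math.sqrt(sum(d * d for d in direction))
    r = radius * rng.random() ** (1.0 / 3.0)
    return tuple(c + r * d / length for c, d in zip(center, direction))

rng = random.Random(4)

# f is zero on every sampled point (t, t, 0) of the subspace M.
vanishes_on_M = all(f((t, t, 0.0)) == 0.0
                    for t in (rng.uniform(-100, 100) for _ in range(1000)))

# On U, f(center + h) = 3 + <center, h> >= 3 - sqrt(3) > 0 whenever ||h|| < 1.
positive_on_U = all(f(sample_open_ball(center, radius, rng)) > 0.0 for _ in range(1000))
```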

Supporting hyperplanes

Since points are trivially convex, geometric Hahn–Banach implies that functionals can detect the boundary of a set. In particular, let [math]\displaystyle{ X }[/math] be a real topological vector space and [math]\displaystyle{ A \subseteq X }[/math] be convex with [math]\displaystyle{ \operatorname{Int} A \neq \varnothing. }[/math] If [math]\displaystyle{ a_0 \in A \setminus \operatorname{Int} A }[/math] then there is a functional that is vanishing at [math]\displaystyle{ a_0, }[/math] but supported on the interior of [math]\displaystyle{ A. }[/math][19]

Call a normed space [math]\displaystyle{ X }[/math] smooth if at each point [math]\displaystyle{ x }[/math] on the boundary of its unit ball there exists a unique closed supporting hyperplane to the unit ball at [math]\displaystyle{ x. }[/math] Köthe showed in 1983 that a normed space is smooth at a point [math]\displaystyle{ x }[/math] if and only if the norm is Gateaux differentiable at that point.[3]

Balanced or disked neighborhoods

Let [math]\displaystyle{ U }[/math] be a convex balanced neighborhood of the origin in a locally convex topological vector space [math]\displaystyle{ X }[/math] and suppose [math]\displaystyle{ x \in X }[/math] is not an element of [math]\displaystyle{ U. }[/math] Then there exists a continuous linear functional [math]\displaystyle{ f }[/math] on [math]\displaystyle{ X }[/math] such that[3] [math]\displaystyle{ \sup |f(U)| \leq |f(x)|. }[/math]

Applications

The Hahn–Banach theorem is the first sign of an important philosophy in functional analysis: to understand a space, one should understand its continuous functionals.

For example, linear subspaces are characterized by functionals: if X is a normed vector space with linear subspace M (not necessarily closed) and if [math]\displaystyle{ z }[/math] is an element of X not in the closure of M, then there exists a continuous linear map [math]\displaystyle{ f : X \to \mathbf{K} }[/math] with [math]\displaystyle{ f(m) = 0 }[/math] for all [math]\displaystyle{ m \in M, }[/math] [math]\displaystyle{ f(z) = 1, }[/math] and [math]\displaystyle{ \|f\| = \operatorname{dist}(z, M)^{-1}. }[/math] (To see this, note that [math]\displaystyle{ \operatorname{dist}(\cdot, M) }[/math] is a sublinear function.) Moreover, if [math]\displaystyle{ z }[/math] is an element of X, then there exists a continuous linear map [math]\displaystyle{ f : X \to \mathbf{K} }[/math] such that [math]\displaystyle{ f(z) = \|z\| }[/math] and [math]\displaystyle{ \|f\| \leq 1. }[/math] This implies that the natural injection [math]\displaystyle{ J }[/math] from a normed space X into its double dual [math]\displaystyle{ X^{**} }[/math] is isometric.
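
Both functionals can be written down explicitly in the Euclidean plane, where no extension argument is needed. The sketch below assumes that M is the x-axis and picks a hypothetical point z = (3, 2); it relies on the fact that in Euclidean space the operator norm of the functional x ↦ ⟨w, x⟩ equals the norm of w.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

# Hypothetical data: M is the x-axis in the Euclidean plane and z lies off it.
z = (3.0, 2.0)
dist_z_M = abs(z[1])            # distance from z to the x-axis

# f(x) = <w, x> with w orthogonal to M, scaled so that f(z) = 1.
w = (0.0, 1.0 / z[1])
f = lambda x: dot(w, x)
norm_f = norm(w)                # operator norm of f in the Euclidean setting

# Norming functional for z: g(x) = <z / ||z||, x> satisfies g(z) = ||z|| and ||g|| = 1.
unit_z = tuple(c / norm(z) for c in z)
g = lambda x: dot(unit_z, x)
norm_g = norm(unit_z)
```

Here `norm_f` equals `1 / dist_z_M`, matching the stated norm of the annihilating functional, and `g` is a norm-one functional attaining the norm of z.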

That last result also suggests that the Hahn–Banach theorem can often be used to locate a "nicer" topology in which to work. For example, many results in functional analysis assume that a space is Hausdorff or locally convex. However, suppose X is a topological vector space, not necessarily Hausdorff or locally convex, but with a nonempty, proper, convex, open set M. Then geometric Hahn–Banach implies that there is a hyperplane separating M from any other point. In particular, there must exist a nonzero functional on X — that is, the continuous dual space [math]\displaystyle{ X^* }[/math] is non-trivial.[3][25] Considering X with the weak topology induced by [math]\displaystyle{ X^*, }[/math] then X becomes locally convex; by the second bullet of geometric Hahn–Banach, the weak topology on this new space separates points. Thus X with this weak topology becomes Hausdorff. This sometimes allows some results from locally convex topological vector spaces to be applied to non-Hausdorff and non-locally convex spaces.

Partial differential equations

The Hahn–Banach theorem is often useful when one wishes to apply the method of a priori estimates. Suppose that we wish to solve the linear differential equation [math]\displaystyle{ P u = f }[/math] for [math]\displaystyle{ u, }[/math] with [math]\displaystyle{ f }[/math] given in some Banach space X. If we have control on the size of [math]\displaystyle{ u }[/math] in terms of [math]\displaystyle{ \|f\|_X }[/math] and we can think of [math]\displaystyle{ u }[/math] as a bounded linear functional on some suitable space of test functions [math]\displaystyle{ g, }[/math] then we can view [math]\displaystyle{ f }[/math] as a linear functional by adjunction: [math]\displaystyle{ (f, g) = (u, P^*g). }[/math] At first, this functional is only defined on the image of [math]\displaystyle{ P, }[/math] but using the Hahn–Banach theorem, we can try to extend it to the entire codomain X. The resulting functional is often defined to be a weak solution to the equation.

Characterizing reflexive Banach spaces

Theorem[26] — A real Banach space is reflexive if and only if every pair of non-empty disjoint closed convex subsets, one of which is bounded, can be strictly separated by a hyperplane.

Example from Fredholm theory

To illustrate an actual application of the Hahn–Banach theorem, we now state a result that follows almost entirely from the Hahn–Banach theorem.

Proposition — Suppose [math]\displaystyle{ X }[/math] is a Hausdorff locally convex TVS over the field [math]\displaystyle{ \mathbf{K} }[/math] and [math]\displaystyle{ Y }[/math] is a vector subspace of [math]\displaystyle{ X }[/math] that is TVS–isomorphic to [math]\displaystyle{ \mathbf{K}^I }[/math] for some set [math]\displaystyle{ I. }[/math] Then [math]\displaystyle{ Y }[/math] is a closed and complemented vector subspace of [math]\displaystyle{ X. }[/math]

The above result may be used to show that every closed vector subspace of [math]\displaystyle{ \R^{\N} }[/math] is complemented because any such space is either finite dimensional or else TVS–isomorphic to [math]\displaystyle{ \R^{\N}. }[/math]

Generalizations

General template

There are now many other versions of the Hahn–Banach theorem. The general template for the various versions of the Hahn–Banach theorem presented in this article is as follows:

[math]\displaystyle{ p : X \to \R }[/math] is a sublinear function (possibly a seminorm) on a vector space [math]\displaystyle{ X, }[/math] [math]\displaystyle{ M }[/math] is a vector subspace of [math]\displaystyle{ X }[/math] (possibly closed), and [math]\displaystyle{ f }[/math] is a linear functional on [math]\displaystyle{ M }[/math] satisfying [math]\displaystyle{ |f| \leq p }[/math] on [math]\displaystyle{ M }[/math] (and possibly some other conditions). One then concludes that there exists a linear extension [math]\displaystyle{ F }[/math] of [math]\displaystyle{ f }[/math] to [math]\displaystyle{ X }[/math] such that [math]\displaystyle{ |F| \leq p }[/math] on [math]\displaystyle{ X }[/math] (possibly with additional properties).

Theorem[3] — If [math]\displaystyle{ D }[/math] is an absorbing disk in a real or complex vector space [math]\displaystyle{ X }[/math] and if [math]\displaystyle{ f }[/math] is a linear functional defined on a vector subspace [math]\displaystyle{ M }[/math] of [math]\displaystyle{ X }[/math] such that [math]\displaystyle{ |f| \leq 1 }[/math] on [math]\displaystyle{ M \cap D, }[/math] then there exists a linear functional [math]\displaystyle{ F }[/math] on [math]\displaystyle{ X }[/math] extending [math]\displaystyle{ f }[/math] such that [math]\displaystyle{ |F| \leq 1 }[/math] on [math]\displaystyle{ D. }[/math]

For seminorms

Hahn–Banach theorem for seminorms[27][28] — If [math]\displaystyle{ p : M \to \Reals }[/math] is a seminorm defined on a vector subspace [math]\displaystyle{ M }[/math] of [math]\displaystyle{ X, }[/math] and if [math]\displaystyle{ q : X \to \Reals }[/math] is a seminorm on [math]\displaystyle{ X }[/math] such that [math]\displaystyle{ p \leq q\big\vert_M, }[/math] then there exists a seminorm [math]\displaystyle{ P : X \to \Reals }[/math] on [math]\displaystyle{ X }[/math] such that [math]\displaystyle{ P\big\vert_M = p }[/math] on [math]\displaystyle{ M }[/math] and [math]\displaystyle{ P \leq q }[/math] on [math]\displaystyle{ X. }[/math]

So for example, suppose that [math]\displaystyle{ f }[/math] is a bounded linear functional defined on a vector subspace [math]\displaystyle{ M }[/math] of a normed space [math]\displaystyle{ X, }[/math] so its operator norm [math]\displaystyle{ \|f\| }[/math] is a non-negative real number. Then the linear functional's absolute value [math]\displaystyle{ p := |f| }[/math] is a seminorm on [math]\displaystyle{ M }[/math] and the map [math]\displaystyle{ q : X \to \Reals }[/math] defined by [math]\displaystyle{ q(x) = \|f\| \, \|x\| }[/math] is a seminorm on [math]\displaystyle{ X }[/math] that satisfies [math]\displaystyle{ p \leq q\big\vert_M }[/math] on [math]\displaystyle{ M. }[/math] The Hahn–Banach theorem for seminorms guarantees the existence of a seminorm [math]\displaystyle{ P : X \to \Reals }[/math] that is equal to [math]\displaystyle{ |f| }[/math] on [math]\displaystyle{ M }[/math] (since [math]\displaystyle{ P\big\vert_M = p = |f| }[/math]) and is bounded above by [math]\displaystyle{ P(x) \leq \|f\| \, \|x\| }[/math] everywhere on [math]\displaystyle{ X }[/math] (since [math]\displaystyle{ P \leq q }[/math]).
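
The example can be made concrete in the Euclidean plane. In the sketch below, M is assumed to be the diagonal {(t, t)}, the functional sends (t, t) to t, and the candidate seminorm `P` is the absolute value of the hypothetical extension F(x) = (x1 + x2)/2. These choices are illustrative only; the theorem merely asserts that some such seminorm exists.

```python
import math
import random

# Hypothetical setup: X is the Euclidean plane, M is the diagonal {(t, t)}.
def f_on_M(t):
    """The functional f on M, written in the coordinate t of the point (t, t)."""
    return t

# Operator norm of f on M: |f(t, t)| / ||(t, t)|| = |t| / (sqrt(2) |t|).
norm_f = 1.0 / math.sqrt(2.0)

def q(x):
    """The dominating seminorm q(x) = ||f|| ||x||."""
    return norm_f * math.hypot(x[0], x[1])

def P(x):
    """Candidate seminorm on X: |F(x)| for the extension F(x) = (x1 + x2) / 2."""
    return abs(x[0] + x[1]) / 2.0

rng = random.Random(1)
samples = [(rng.uniform(-10, 10), rng.uniform(-10, 10)) for _ in range(2000)]

# P restricted to M agrees with |f|, and P is dominated by q on the sampled points.
agrees_on_M = all(abs(P((t, t)) - abs(f_on_M(t))) < 1e-12 for t, _ in samples)
dominated = all(P(x) <= q(x) + 1e-12 for x in samples)
```

The domination `P(x) <= q(x)` is an instance of the Cauchy–Schwarz inequality, |x1 + x2| ≤ √2 ‖x‖.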

Geometric separation

Hahn–Banach sandwich theorem[3] — Let [math]\displaystyle{ p : X \to \R }[/math] be a sublinear function on a real vector space [math]\displaystyle{ X, }[/math] let [math]\displaystyle{ S \subseteq X }[/math] be any subset of [math]\displaystyle{ X, }[/math] and let [math]\displaystyle{ f : S \to \R }[/math] be any map. If there exist positive real numbers [math]\displaystyle{ a }[/math] and [math]\displaystyle{ b }[/math] such that [math]\displaystyle{ 0 \geq \inf_{s \in S} [p(s - a x - b y) - f(s) - a f(x) - b f(y)] \qquad \text{ for all } x, y \in S, }[/math] then there exists a linear functional [math]\displaystyle{ F : X \to \R }[/math] on [math]\displaystyle{ X }[/math] such that [math]\displaystyle{ F \leq p }[/math] on [math]\displaystyle{ X }[/math] and [math]\displaystyle{ f \leq F \leq p }[/math] on [math]\displaystyle{ S. }[/math]

Maximal dominated linear extension

Theorem[3] (Andenaes, 1970) — Let [math]\displaystyle{ p : X \to \R }[/math] be a sublinear function on a real vector space [math]\displaystyle{ X, }[/math] let [math]\displaystyle{ f : M \to \R }[/math] be a linear functional on a vector subspace [math]\displaystyle{ M }[/math] of [math]\displaystyle{ X }[/math] such that [math]\displaystyle{ f \leq p }[/math] on [math]\displaystyle{ M, }[/math] and let [math]\displaystyle{ S \subseteq X }[/math] be any subset of [math]\displaystyle{ X. }[/math] Then there exists a linear functional [math]\displaystyle{ F : X \to \R }[/math] on [math]\displaystyle{ X }[/math] that extends [math]\displaystyle{ f, }[/math] satisfies [math]\displaystyle{ F \leq p }[/math] on [math]\displaystyle{ X, }[/math] and is (pointwise) maximal on [math]\displaystyle{ S }[/math] in the following sense: if [math]\displaystyle{ \widehat{F} : X \to \R }[/math] is a linear functional on [math]\displaystyle{ X }[/math] that extends [math]\displaystyle{ f }[/math] and satisfies [math]\displaystyle{ \widehat{F} \leq p }[/math] on [math]\displaystyle{ X, }[/math] then [math]\displaystyle{ F \leq \widehat{F} }[/math] on [math]\displaystyle{ S }[/math] implies [math]\displaystyle{ F = \widehat{F} }[/math] on [math]\displaystyle{ S. }[/math]

If [math]\displaystyle{ S = \{s\} }[/math] is a singleton set (where [math]\displaystyle{ s \in X }[/math] is some vector) and if [math]\displaystyle{ F : X \to \R }[/math] is such a maximal dominated linear extension of [math]\displaystyle{ f : M \to \R, }[/math] then [math]\displaystyle{ F(s) = \inf_{m \in M} [f(m) + p(s - m)]. }[/math][3]
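
The range of values that a dominated extension can take at a single vector s can be approximated numerically. The sketch below is an illustration under assumed data: X is the plane, p is the ℓ¹ norm, M is the x-axis, and f(t, 0) = t/2. It evaluates the infimum of f(m) + p(s − m) and the supremum of −p(−m − s) − f(m) over a finite grid of points m in M, so in general it only approximates the true bounds.

```python
import random

def p(x):
    """Sublinear function on the plane (here the l1 norm)."""
    return abs(x[0]) + abs(x[1])

def f(t):
    """f on M = x-axis, evaluated at the point (t, 0); note f <= p on M."""
    return 0.5 * t

s = (0.0, 1.0)                                   # vector outside M
grid = [k / 100.0 for k in range(-1000, 1001)]   # sample points (t, 0) of M

# Largest and smallest admissible values at s, approximated on the grid.
upper = min(f(t) + p((s[0] - t, s[1])) for t in grid)
lower = max(-p((-t - s[0], -s[1])) - f(t) for t in grid)

def extension(b):
    """Linear extension of f to the plane taking the value b at s = (0, 1)."""
    return lambda x: 0.5 * x[0] + b * x[1]

def dominated_by_p(F, rng, trials=2000):
    """Spot-check F <= p at randomly sampled points."""
    points = ((rng.uniform(-10, 10), rng.uniform(-10, 10)) for _ in range(trials))
    return all(F(x) <= p(x) + 1e-12 for x in points)
```

For this data the grid contains the minimizing point t = 0, so `upper` is 1.0 and `lower` is -1.0. The extension with value `upper` at s is the one that is maximal at s, while a value such as 1.5 already violates domination at s itself.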

Vector valued Hahn–Banach

Vector–valued Hahn–Banach theorem[3] — If [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math] are vector spaces over the same field and if [math]\displaystyle{ f : M \to Y }[/math] is a linear map defined on a vector subspace [math]\displaystyle{ M }[/math] of [math]\displaystyle{ X, }[/math] then there exists a linear map [math]\displaystyle{ F : X \to Y }[/math] that extends [math]\displaystyle{ f. }[/math]

Invariant Hahn–Banach

A set [math]\displaystyle{ \Gamma }[/math] of maps [math]\displaystyle{ X \to X }[/math] is commutative (with respect to function composition [math]\displaystyle{ \,\circ\, }[/math]) if [math]\displaystyle{ F \circ G = G \circ F }[/math] for all [math]\displaystyle{ F, G \in \Gamma. }[/math] Say that a function [math]\displaystyle{ f }[/math] defined on a subset [math]\displaystyle{ M }[/math] of [math]\displaystyle{ X }[/math] is [math]\displaystyle{ \Gamma }[/math]-invariant if [math]\displaystyle{ L(M) \subseteq M }[/math] and [math]\displaystyle{ f \circ L = f }[/math] on [math]\displaystyle{ M }[/math] for every [math]\displaystyle{ L \in \Gamma. }[/math]

An invariant Hahn–Banach theorem[29] — Suppose [math]\displaystyle{ \Gamma }[/math] is a commutative set of continuous linear maps from a normed space [math]\displaystyle{ X }[/math] into itself and let [math]\displaystyle{ f }[/math] be a continuous linear functional defined on some vector subspace [math]\displaystyle{ M }[/math] of [math]\displaystyle{ X }[/math] that is [math]\displaystyle{ \Gamma }[/math]-invariant, which means that [math]\displaystyle{ L(M) \subseteq M }[/math] and [math]\displaystyle{ f \circ L = f }[/math] on [math]\displaystyle{ M }[/math] for every [math]\displaystyle{ L \in \Gamma. }[/math] Then [math]\displaystyle{ f }[/math] has a continuous linear extension [math]\displaystyle{ F }[/math] to all of [math]\displaystyle{ X }[/math] that has the same operator norm [math]\displaystyle{ \|f\| = \|F\| }[/math] and is also [math]\displaystyle{ \Gamma }[/math]-invariant, meaning that [math]\displaystyle{ F \circ L = F }[/math] on [math]\displaystyle{ X }[/math] for every [math]\displaystyle{ L \in \Gamma. }[/math]

This theorem may be summarized:

Every [math]\displaystyle{ \Gamma }[/math]-invariant continuous linear functional defined on a vector subspace of a normed space [math]\displaystyle{ X }[/math] has a [math]\displaystyle{ \Gamma }[/math]-invariant Hahn–Banach extension to all of [math]\displaystyle{ X. }[/math][29]

For nonlinear functions

The following theorem of Mazur–Orlicz (1953) is equivalent to the Hahn–Banach theorem.

Mazur–Orlicz theorem[30] — Let [math]\displaystyle{ p : X \to \R }[/math] be a sublinear function on a real or complex vector space [math]\displaystyle{ X, }[/math] let [math]\displaystyle{ T }[/math] be any set, and let [math]\displaystyle{ R : T \to \R }[/math] and [math]\displaystyle{ v : T \to X }[/math] be any maps. The following statements are equivalent:

  1. there exists a real-valued linear functional [math]\displaystyle{ F }[/math] on [math]\displaystyle{ X }[/math] such that [math]\displaystyle{ F \leq p }[/math] on [math]\displaystyle{ X }[/math] and [math]\displaystyle{ R \leq F \circ v }[/math] on [math]\displaystyle{ T }[/math];
  2. for any finite sequence [math]\displaystyle{ s_1, \ldots, s_n }[/math] of [math]\displaystyle{ n \gt 0 }[/math] non-negative real numbers, and any sequence [math]\displaystyle{ t_1, \ldots, t_n \in T }[/math] of elements of [math]\displaystyle{ T, }[/math] [math]\displaystyle{ \sum_{i=1}^n s_i R\left(t_i\right) \leq p\left(\sum_{i=1}^n s_i v\left(t_i\right)\right). }[/math]

The following theorem characterizes when a scalar function defined on a subset of [math]\displaystyle{ X }[/math] (not necessarily linear) has a continuous linear extension to all of [math]\displaystyle{ X. }[/math]

Theorem (The extension principle[31]) — Let [math]\displaystyle{ f }[/math] be a scalar-valued function on a subset [math]\displaystyle{ S }[/math] of a topological vector space [math]\displaystyle{ X. }[/math] Then there exists a continuous linear functional [math]\displaystyle{ F }[/math] on [math]\displaystyle{ X }[/math] extending [math]\displaystyle{ f }[/math] if and only if there exists a continuous seminorm [math]\displaystyle{ p }[/math] on [math]\displaystyle{ X }[/math] such that [math]\displaystyle{ \left|\sum_{i=1}^n a_i f(s_i)\right| \leq p\left(\sum_{i=1}^n a_is_i\right) }[/math] for all positive integers [math]\displaystyle{ n }[/math] and all finite sequences [math]\displaystyle{ a_1, \ldots, a_n }[/math] of scalars and elements [math]\displaystyle{ s_1, \ldots, s_n }[/math] of [math]\displaystyle{ S. }[/math]

Converse

Let X be a topological vector space. A vector subspace M of X has the extension property if any continuous linear functional on M can be extended to a continuous linear functional on X, and we say that X has the Hahn–Banach extension property (HBEP) if every vector subspace of X has the extension property.[32]

The Hahn–Banach theorem guarantees that every Hausdorff locally convex space has the HBEP. For complete metrizable topological vector spaces there is a converse, due to Kalton: every complete metrizable TVS with the Hahn–Banach extension property is locally convex.[32] On the other hand, a vector space X of uncountable dimension, endowed with the finest vector topology, is a topological vector space with the Hahn–Banach extension property that is neither locally convex nor metrizable.[32]

A vector subspace M of a TVS X has the separation property if for every element [math]\displaystyle{ x }[/math] of X such that [math]\displaystyle{ x \not\in M, }[/math] there exists a continuous linear functional [math]\displaystyle{ f }[/math] on X such that [math]\displaystyle{ f(x) \neq 0 }[/math] and [math]\displaystyle{ f(m) = 0 }[/math] for all [math]\displaystyle{ m \in M. }[/math] Clearly, the continuous dual space of a TVS X separates points on X if and only if [math]\displaystyle{ \{0\} }[/math] has the separation property. In 1992, Kakol proved that for any infinite-dimensional vector space X, there exist TVS-topologies on X that do not have the HBEP despite having enough continuous linear functionals for the continuous dual space to separate points on X. However, if X is a TVS then every vector subspace of X has the extension property if and only if every vector subspace of X has the separation property.[32]

Relation to axiom of choice and other theorems

The proof of the Hahn–Banach theorem for real vector spaces (HB) commonly uses Zorn's lemma, which in the axiomatic framework of Zermelo–Fraenkel set theory (ZF) is equivalent to the axiom of choice (AC). It was discovered by Łoś and Ryll-Nardzewski[12] and independently by Luxemburg[11] that HB can be proved using the ultrafilter lemma (UL), which is equivalent (under ZF) to the Boolean prime ideal theorem (BPI). BPI is strictly weaker than the axiom of choice and it was later shown that HB is strictly weaker than BPI.[33]

The ultrafilter lemma is equivalent (under ZF) to the Banach–Alaoglu theorem,[34] which is another foundational theorem in functional analysis. Although the Banach–Alaoglu theorem implies HB,[35] it is not equivalent to it (said differently, the Banach–Alaoglu theorem is strictly stronger than HB). However, HB is equivalent to a certain weakened version of the Banach–Alaoglu theorem for normed spaces.[36] The Hahn–Banach theorem is also equivalent to the following statement:[37]

(∗): On every Boolean algebra B there exists a "probability charge", that is: a non-constant finitely additive map from [math]\displaystyle{ B }[/math] into [math]\displaystyle{ [0, 1]. }[/math]

(BPI is equivalent to the statement that there are always non-constant probability charges which take only the values 0 and 1.)
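
On a finite Boolean algebra such charges can be written down directly, without any choice principle. The sketch below uses the power set of a three-element set as a hypothetical example and checks two maps: the normalized counting measure, and a point mass, which takes only the values 0 and 1. The function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

# The Boolean algebra of all subsets of a three-element set.
universe = frozenset({0, 1, 2})
algebra = [frozenset(c)
           for r in range(len(universe) + 1)
           for c in combinations(sorted(universe), r)]

def uniform_charge(a):
    """Normalized counting measure."""
    return Fraction(len(a), len(universe))

def point_charge(a):
    """Point mass at 0; takes only the values 0 and 1."""
    return Fraction(1) if 0 in a else Fraction(0)

def is_probability_charge(mu):
    """Check: values in [0, 1], non-constant, and additive on disjoint pairs."""
    in_range = all(0 <= mu(a) <= 1 for a in algebra)
    non_constant = len({mu(a) for a in algebra}) > 1
    additive = all(mu(a | b) == mu(a) + mu(b)
                   for a in algebra for b in algebra if not (a & b))
    return in_range and non_constant and additive

two_valued = {point_charge(a) for a in algebra} == {0, 1}
```

Exact rational arithmetic via `Fraction` avoids rounding issues in the additivity check.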

In ZF, the Hahn–Banach theorem suffices to derive the existence of a non-Lebesgue measurable set.[38] Moreover, the Hahn–Banach theorem implies the Banach–Tarski paradox.[39]

For separable Banach spaces, D. K. Brown and S. G. Simpson proved that the Hahn–Banach theorem follows from WKL0, a weak subsystem of second-order arithmetic that takes a form of Kőnig's lemma restricted to binary trees as an axiom. In fact, they prove that under a weak set of assumptions, the two are equivalent, an example of reverse mathematics.[40][41]

Notes

  1. This definition means, for instance, that [math]\displaystyle{ F_b(x) = F_b(0 + 1 x) = f(0) + 1 b = b }[/math] and if [math]\displaystyle{ m \in M }[/math] then [math]\displaystyle{ F_b(m) = F_b(m + 0 x) = f(m) + 0 b = f(m). }[/math] In fact, if [math]\displaystyle{ G : M \oplus \R x \to \R }[/math] is any linear extension of [math]\displaystyle{ f }[/math] to [math]\displaystyle{ M \oplus \R x }[/math] then [math]\displaystyle{ G = F_b }[/math] for [math]\displaystyle{ b := G(x). }[/math] In other words, every linear extension of [math]\displaystyle{ f }[/math] to [math]\displaystyle{ M \oplus \R x }[/math] is of the form [math]\displaystyle{ F_b }[/math] for some (unique) [math]\displaystyle{ b. }[/math]
  2. Explicitly, for any real number [math]\displaystyle{ b \in \R, }[/math] [math]\displaystyle{ F_b \leq p }[/math] on [math]\displaystyle{ M \oplus \R x }[/math] if and only if [math]\displaystyle{ a \leq b \leq c. }[/math] Combined with the fact that [math]\displaystyle{ F_b(x) = b, }[/math] it follows that the dominated linear extension of [math]\displaystyle{ f }[/math] to [math]\displaystyle{ M \oplus \R x }[/math] is unique if and only if [math]\displaystyle{ a = c, }[/math] in which case this scalar will be the extension's values at [math]\displaystyle{ x. }[/math] Since every linear extension of [math]\displaystyle{ f }[/math] to [math]\displaystyle{ M \oplus \R x }[/math] is of the form [math]\displaystyle{ F_b }[/math] for some [math]\displaystyle{ b, }[/math] the bounds [math]\displaystyle{ a \leq b = F_b(x) \leq c }[/math] thus also limit the range of possible values (at [math]\displaystyle{ x }[/math]) that can be taken by any of [math]\displaystyle{ f }[/math]'s dominated linear extensions. Specifically, if [math]\displaystyle{ F : X \to \R }[/math] is any linear extension of [math]\displaystyle{ f }[/math] satisfying [math]\displaystyle{ F \leq p }[/math] then for every [math]\displaystyle{ x \in X \setminus M, }[/math] [math]\displaystyle{ \sup_{m \in M}[-p(-m - x) - f(m)] ~\leq~ F(x) ~\leq~ \inf_{m \in M} [p(m + x) - f(m)]. }[/math]
  3. Geometric illustration: The geometric idea of the above proof can be fully presented in the case of [math]\displaystyle{ X = \R^2, M = \{(x, 0) : x \in \R\}. }[/math] First, define the simple-minded extension [math]\displaystyle{ f_0(x, y) = f(x). }[/math] It doesn't work, since maybe [math]\displaystyle{ f_0 \not\leq p }[/math]. But it is a step in the right direction. [math]\displaystyle{ p-f_0 }[/math] is still convex, and [math]\displaystyle{ p-f_0 \geq f-f_0. }[/math] Further, [math]\displaystyle{ f-f_0 }[/math] is identically zero on the x-axis. Thus we have reduced to the case of [math]\displaystyle{ f = 0, p \geq 0 }[/math] on the x-axis. If [math]\displaystyle{ p \geq 0 }[/math] on [math]\displaystyle{ \R^2, }[/math] then we are done. Otherwise, pick some [math]\displaystyle{ v \in \R^2 }[/math] such that [math]\displaystyle{ p(v) \lt 0. }[/math] The idea now is to perform a simultaneous bounding of [math]\displaystyle{ p }[/math] on [math]\displaystyle{ v + M }[/math] and [math]\displaystyle{ -v+M }[/math] such that [math]\displaystyle{ p \geq b }[/math] on [math]\displaystyle{ v+M }[/math] and [math]\displaystyle{ p \geq -b }[/math] on [math]\displaystyle{ -v+M; }[/math] then defining [math]\displaystyle{ \tilde f(w + rv) = rb }[/math] would give the desired extension. Since [math]\displaystyle{ -v+M, v+M }[/math] are on opposite sides of [math]\displaystyle{ M, }[/math] and [math]\displaystyle{ p \lt 0 }[/math] at some point on [math]\displaystyle{ v+M, }[/math] by convexity of [math]\displaystyle{ p, }[/math] we must have [math]\displaystyle{ p \geq 0 }[/math] on all points on [math]\displaystyle{ -v+M. }[/math] Thus [math]\displaystyle{ \inf_{u\in -v + M} p(u) }[/math] is finite. Geometrically, this works because [math]\displaystyle{ \{z : p(z) \lt 0\} }[/math] is a convex set that is disjoint from [math]\displaystyle{ M, }[/math] and thus must lie entirely on one side of [math]\displaystyle{ M. }[/math] Define [math]\displaystyle{ b = -\inf_{u\in -v + M} p(u). }[/math] This satisfies [math]\displaystyle{ p\geq -b }[/math] on [math]\displaystyle{ -v+M. }[/math] It remains to check the other side. For all [math]\displaystyle{ v+w \in v+M, }[/math] convexity implies that for all [math]\displaystyle{ -v+w' \in -v+M, p(v+w) + p(-v +w') \geq 2p((w+w')/2) = 0, }[/math] thus [math]\displaystyle{ p(v + w) \geq \sup_{u\in -v + M} -p(u) = b. }[/math] Since during the proof, we only used convexity of [math]\displaystyle{ p }[/math], we see that the lemma remains true for merely convex [math]\displaystyle{ p. }[/math]
  4. Like every non-negative scalar multiple of a norm, this seminorm [math]\displaystyle{ \|f\| \, \|\cdot\| }[/math] (the product of the non-negative real number [math]\displaystyle{ \|f\| }[/math] with the norm [math]\displaystyle{ \|\cdot\| }[/math]) is a norm when [math]\displaystyle{ \|f\| }[/math] is positive, although this fact is not needed for the proof.

Proofs

  1. If [math]\displaystyle{ z = a + i b \in \Complex }[/math] has real part [math]\displaystyle{ \operatorname{Re} z = a }[/math] then [math]\displaystyle{ - \operatorname{Re} (i z) = b, }[/math] which proves that [math]\displaystyle{ z = \operatorname{Re} z - i \operatorname{Re} (i z). }[/math] Substituting [math]\displaystyle{ F(x) }[/math] in for [math]\displaystyle{ z }[/math] and using [math]\displaystyle{ i F(x) = F(i x) }[/math] gives [math]\displaystyle{ F(x) = \operatorname{Re} F(x) - i \operatorname{Re} F(i x). }[/math] [math]\displaystyle{ \blacksquare }[/math]
  2. Let [math]\displaystyle{ F }[/math] be any homogeneous scalar-valued map on [math]\displaystyle{ X }[/math] (such as a linear functional) and let [math]\displaystyle{ p : X \to \R }[/math] be any map that satisfies [math]\displaystyle{ p(u x) = p(x) }[/math] for all [math]\displaystyle{ x }[/math] and unit length scalars [math]\displaystyle{ u }[/math] (such as a seminorm). If [math]\displaystyle{ |F| \leq p }[/math] then [math]\displaystyle{ \operatorname{Re} F \leq |\operatorname{Re} F| \leq |F| \leq p. }[/math] For the converse, assume [math]\displaystyle{ \operatorname{Re} F \leq p }[/math] and fix [math]\displaystyle{ x \in X. }[/math] Let [math]\displaystyle{ r = |F(x)| }[/math] and pick any [math]\displaystyle{ \theta \in \R }[/math] such that [math]\displaystyle{ F(x) = r e^{i \theta}; }[/math] it remains to show [math]\displaystyle{ r \leq p(x). }[/math] Homogeneity of [math]\displaystyle{ F }[/math] implies [math]\displaystyle{ F\left(e^{-i \theta} x\right) = r }[/math] is real so that [math]\displaystyle{ \operatorname{Re} F\left(e^{-i \theta} x\right) = F\left(e^{-i \theta} x\right). }[/math] By assumption, [math]\displaystyle{ \operatorname{Re} F \leq p }[/math] and [math]\displaystyle{ p\left(e^{-i \theta} x\right) = p(x), }[/math] so that [math]\displaystyle{ r = \operatorname{Re} F\left(e^{-i \theta} x\right) \leq p\left(e^{-i \theta} x\right) = p(x), }[/math] as desired. [math]\displaystyle{ \blacksquare }[/math]
  3. The map [math]\displaystyle{ F }[/math] being an extension of [math]\displaystyle{ f }[/math] means that [math]\displaystyle{ \operatorname{domain} f \subseteq \operatorname{domain} F }[/math] and [math]\displaystyle{ F(m) = f(m) }[/math] for every [math]\displaystyle{ m \in \operatorname{domain} f. }[/math] Consequently, [math]\displaystyle{ \{|f(m)| : \|m\| \leq 1, m \in \operatorname{domain} f\} = \{|F(m)|: \|m\| \leq 1, m \in \operatorname{domain} f\} \subseteq \{|F(x)\,| : \|x\| \leq 1, x \in \operatorname{domain} F\} }[/math] and so the supremum of the set on the left hand side, which is [math]\displaystyle{ \|f\|, }[/math] does not exceed the supremum of the right hand side, which is [math]\displaystyle{ \|F\|. }[/math] In other words, [math]\displaystyle{ \|f\| \leq \|F\|. }[/math]
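
The identity in the first proof above, which recovers a complex-linear functional from its real part, can be checked numerically. The sketch below assumes a hypothetical functional on the two-dimensional complex space with arbitrarily chosen coefficients and rebuilds it from its real part alone.

```python
import random

# Hypothetical complex-linear functional on C^2 with arbitrarily chosen coefficients.
coeffs = (2 - 1j, 0.5 + 3j)

def F(x):
    return coeffs[0] * x[0] + coeffs[1] * x[1]

def u(x):
    """Real part of F; a real-linear functional on C^2."""
    return F(x).real

def F_from_real_part(x):
    """Rebuild F from u alone via F(x) = u(x) - i u(i x)."""
    ix = tuple(1j * c for c in x)
    return complex(u(x), -u(ix))

rng = random.Random(2)
samples = [tuple(complex(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(2))
           for _ in range(500)]

# Largest discrepancy between F and its reconstruction over the samples.
max_error = max(abs(F(x) - F_from_real_part(x)) for x in samples)
```

Up to floating-point rounding, `max_error` is zero, reflecting that the real part of F(ix) equals minus the imaginary part of F(x).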

References

  1. O'Connor, John J.; Robertson, Edmund F., "Hahn–Banach theorem", MacTutor History of Mathematics archive, University of St Andrews, http://www-history.mcs.st-andrews.ac.uk/Biographies/Helly.html .
  2. See M. Riesz extension theorem. According to Gårding, L. (1970). "Marcel Riesz in memoriam". Acta Math. 124 (1): I–XI. doi:10.1007/bf02394565. , the argument was known to Riesz already in 1918.
  3. Narici & Beckenstein 2011, pp. 177–220.
  4. Rudin 1991, pp. 56–62.
  5. Rudin 1991, Th. 3.2
  6. Narici & Beckenstein 2011, pp. 177–183.
  7. Schechter 1996, pp. 318–319.
  8. Reed & Simon 1980.
  9. Rudin 1991, Th. 3.2
  10. Narici & Beckenstein 2011, pp. 126–128.
  11. Luxemburg 1962.
  12. Łoś & Ryll-Nardzewski 1951, pp. 233–237.
  13. HAHNBAN file
  14. Narici & Beckenstein 2011, pp. 182, 498.
  15. Narici & Beckenstein 2011, p. 184.
  16. Narici & Beckenstein 2011, p. 182.
  17. Narici & Beckenstein 2011, p. 126.
  18. Harvey, R.; Lawson, H. B. (1983). "An intrinsic characterisation of Kähler manifolds". Invent. Math. 74 (2): 169–198. doi:10.1007/BF01394312. Bibcode: 1983InMat..74..169H.
  19. Zălinescu, C. (2002). Convex analysis in general vector spaces. River Edge, NJ: World Scientific Publishing Co., Inc. pp. 5–7. ISBN 981-238-067-1.
  20. Gabriel Nagy, Real Analysis lecture notes
  21. Brezis, Haim (2011). Functional Analysis, Sobolev Spaces, and Partial Differential Equations. New York: Springer. pp. 6–7. 
  22. Kutateladze, Semen (1996). Fundamentals of Functional Analysis. Kluwer Texts in the Mathematical Sciences. 12. p. 40. doi:10.1007/978-94-015-8755-6. ISBN 978-90-481-4661-1. https://www.researchgate.net/publication/240011075.
  23. Trèves 2006, p. 184.
  24. Narici & Beckenstein 2011, p. 195.
  25. Schaefer & Wolff 1999, p. 47.
  26. Narici & Beckenstein 2011, p. 212.
  27. Wilansky 2013, pp. 18–21.
  28. Narici & Beckenstein 2011, p. 150.
  29. Rudin 1991, p. 141.
  30. Narici & Beckenstein 2011, pp. 177–220.
  31. Edwards 1995, pp. 124–125.
  32. Narici & Beckenstein 2011, pp. 225–273.
  33. Pincus 1974, pp. 203–205.
  34. Schechter 1996, pp. 766–767.
  35. Muger, Michael (2020). Topology for the Working Mathematician. 
  36. Bell, J.; Fremlin, David (1972). "A Geometric Form of the Axiom of Choice". Fundamenta Mathematicae 77 (2): 167–170. doi:10.4064/fm-77-2-167-170. http://matwbn.icm.edu.pl/ksiazki/fm/fm77/fm77116.pdf. Retrieved 26 Dec 2021. 
  37. Schechter, Eric. Handbook of Analysis and its Foundations. p. 620. 
  38. Foreman, M.; Wehrung, F. (1991). "The Hahn–Banach theorem implies the existence of a non-Lebesgue measurable set". Fundamenta Mathematicae 138: 13–19. doi:10.4064/fm-138-1-13-19. http://matwbn.icm.edu.pl/ksiazki/fm/fm138/fm13812.pdf. 
  39. Pawlikowski, Janusz (1991). "The Hahn–Banach theorem implies the Banach–Tarski paradox". Fundamenta Mathematicae 138: 21–22. doi:10.4064/fm-138-1-21-22. 
  40. Brown, D. K.; Simpson, S. G. (1986). "Which set existence axioms are needed to prove the separable Hahn–Banach theorem?". Annals of Pure and Applied Logic 31: 123–144. doi:10.1016/0168-0072(86)90066-7.
  41. Simpson, Stephen G. (2009), Subsystems of second order arithmetic, Perspectives in Logic (2nd ed.), Cambridge University Press, ISBN 978-0-521-88439-6, MR2517689

Bibliography