Operator norm

Short description: Measure of the "size" of linear operators

In mathematics, the operator norm measures the "size" of certain linear operators by assigning each a real number called its operator norm. Formally, it is a norm defined on the space of bounded linear operators between two given normed vector spaces. Informally, the operator norm $\displaystyle{ \|T\| }$ of a linear map $\displaystyle{ T : X \to Y }$ is the maximum factor by which it "lengthens" vectors.

Introduction and definition

Given two normed vector spaces $\displaystyle{ V }$ and $\displaystyle{ W }$ (over the same base field, either the real numbers $\displaystyle{ \R }$ or the complex numbers $\displaystyle{ \Complex }$), a linear map $\displaystyle{ A : V \to W }$ is continuous if and only if there exists a real number $\displaystyle{ c }$ such that[1] $\displaystyle{ \|Av\| \leq c \|v\| \quad \mbox{ for all } v\in V. }$

The norm on the left is the one in $\displaystyle{ W }$ and the norm on the right is the one in $\displaystyle{ V }$. Intuitively, the continuous operator $\displaystyle{ A }$ never increases the length of any vector by more than a factor of $\displaystyle{ c. }$ Thus the image of a bounded set under a continuous operator is also bounded. Because of this property, the continuous linear operators are also known as bounded operators. In order to "measure the size" of $\displaystyle{ A, }$ one can take the infimum of the numbers $\displaystyle{ c }$ such that the above inequality holds for all $\displaystyle{ v \in V. }$ This number represents the maximum scalar factor by which $\displaystyle{ A }$ "lengthens" vectors. In other words, the "size" of $\displaystyle{ A }$ is measured by how much it "lengthens" vectors in the "biggest" case. So we define the operator norm of $\displaystyle{ A }$ as $\displaystyle{ \|A\|_{op} = \inf\{ c \geq 0 : \|Av\| \leq c \|v\| \mbox{ for all } v \in V \}. }$

The infimum is attained as the set of all such $\displaystyle{ c }$ is closed, nonempty, and bounded from below.[2]

It is important to bear in mind that this operator norm depends on the choice of norms for the normed vector spaces $\displaystyle{ V }$ and $\displaystyle{ W }$.

Examples

Every real $\displaystyle{ m }$-by-$\displaystyle{ n }$ matrix corresponds to a linear map from $\displaystyle{ \R^n }$ to $\displaystyle{ \R^m. }$ Each pair of the plethora of (vector) norms applicable to real vector spaces induces an operator norm for all $\displaystyle{ m }$-by-$\displaystyle{ n }$ matrices of real numbers; these induced norms form a subset of matrix norms.

If we specifically choose the Euclidean norm on both $\displaystyle{ \R^n }$ and $\displaystyle{ \R^m, }$ then the matrix norm given to a matrix $\displaystyle{ A }$ is the square root of the largest eigenvalue of the matrix $\displaystyle{ A^{*} A }$ (where $\displaystyle{ A^{*} }$ denotes the conjugate transpose of $\displaystyle{ A }$).[3] This is equivalent to assigning the largest singular value of $\displaystyle{ A. }$
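
As a rough numerical illustration of this characterization, the following sketch estimates the square root of the largest eigenvalue of $\displaystyle{ A^{T} A }$ for a real matrix by power iteration. The function names (`matvec`, `transpose`, `spectral_norm`), the all-ones starting vector and the iteration count are assumptions made for this example only; the method can fail if the starting vector happens to be orthogonal to the dominant singular direction.

```python
import math

def matvec(M, v):
    """Multiply the matrix M (a list of rows) by the vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    """Return the transpose of M as a list of rows."""
    return [list(col) for col in zip(*M)]

def spectral_norm(A, iterations=500):
    """Estimate the Euclidean operator norm of a real matrix A
    by power iteration on A^T A (a sketch, not a robust solver)."""
    At = transpose(A)
    v = [1.0] * len(A[0])  # assumed starting vector
    for _ in range(iterations):
        w = matvec(At, matvec(A, v))
        length = math.sqrt(sum(x * x for x in w))
        if length == 0.0:
            # A v = 0 for the current v; correct for A = 0, otherwise
            # the assumed starting vector was a poor choice.
            return 0.0
        v = [x / length for x in w]
    # v is a unit vector, so ||A v|| approximates the largest singular value.
    return math.sqrt(sum(x * x for x in matvec(A, v)))

diagonal = [[3.0, 0.0], [0.0, 4.0]]
shear = [[1.0, 1.0], [0.0, 1.0]]
print(spectral_norm(diagonal))  # close to 4
print(spectral_norm(shear))     # close to (1 + sqrt(5)) / 2
```

For the shear matrix, $\displaystyle{ A^{T} A }$ has eigenvalues $\displaystyle{ (3 \pm \sqrt{5})/2, }$ so the norm is the golden ratio $\displaystyle{ (1 + \sqrt{5})/2, }$ even though both eigenvalues of the matrix itself equal 1.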

Passing to a typical infinite-dimensional example, consider the sequence space $\displaystyle{ \ell^2, }$ which is an Lp space, defined by $\displaystyle{ \ell^2 = \left\{ \left(a_n\right)_{n \geq 1} : \; a_n \in \Complex, \; \sum_n |a_n|^2 \lt \infty \right\}. }$

This can be viewed as an infinite-dimensional analogue of the Euclidean space $\displaystyle{ \Complex^n. }$ Now consider a bounded sequence $\displaystyle{ s_{\bull} = \left(s_n\right)_{n=1}^{\infty}. }$ The sequence $\displaystyle{ s_{\bull} }$ is an element of the space $\displaystyle{ \ell^{\infty}, }$ with a norm given by $\displaystyle{ \left\|s_{\bull}\right\|_{\infty} = \sup _n \left|s_n\right|. }$

Define an operator $\displaystyle{ T_s }$ by pointwise multiplication: $\displaystyle{ \left(a_n\right)_{n=1}^{\infty} \;\stackrel{T_s}{\mapsto}\;\ \left(s_n \cdot a_n\right)_{n=1}^{\infty}. }$

The operator $\displaystyle{ T_s }$ is bounded with operator norm $\displaystyle{ \left\|T_s\right\|_{op} = \left\|s_{\bull}\right\|_{\infty}. }$
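
The identity can be explored on finitely supported sequences. The sketch below is only an illustration under assumed choices: the sequence $\displaystyle{ s_n = (-1)^n n/(n+1), }$ the truncation length `N` and the helper names are not part of the article. It checks that $\displaystyle{ \|T_s a\| \leq \|s_{\bull}\|_{\infty} \|a\| }$ for one test sequence, and that the standard basis vector $\displaystyle{ e_N }$ is stretched by exactly $\displaystyle{ |s_N|, }$ which approaches the supremum as $\displaystyle{ N }$ grows.

```python
import math

def l2_norm(a):
    """Euclidean norm of a finitely supported sequence."""
    return math.sqrt(sum(abs(x) ** 2 for x in a))

def multiply(s, a):
    """Apply T_s to a: pointwise multiplication (entries past the end are zero)."""
    return [s_n * a_n for s_n, a_n in zip(s, a)]

N = 1000  # assumed truncation length
# Assumed bounded sequence; sup |s_n| over all n is 1 and is never attained.
s = [(-1) ** n * n / (n + 1) for n in range(1, N + 1)]
sup_s = max(abs(x) for x in s)  # equals N / (N + 1) for the truncation

# A generic square-summable test sequence a_n = 1 / n.
a = [1.0 / n for n in range(1, N + 1)]
ratio = l2_norm(multiply(s, a)) / l2_norm(a)

# The basis vector e_N is stretched by exactly |s_N|.
e_N = [0.0] * (N - 1) + [1.0]
ratio_basis = l2_norm(multiply(s, e_N)) / l2_norm(e_N)

print(ratio <= sup_s, ratio_basis, sup_s)
```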

This discussion extends directly to the case where $\displaystyle{ \ell^2 }$ is replaced by a general $\displaystyle{ L^p }$ space with $\displaystyle{ p \gt 1 }$ and $\displaystyle{ \ell^{\infty} }$ replaced by $\displaystyle{ L^{\infty}. }$

Equivalent definitions

Let $\displaystyle{ A : V \to W }$ be a linear operator between normed spaces. The first four definitions are always equivalent, and if in addition $\displaystyle{ V \neq \{0\} }$ then they are all equivalent:

$\displaystyle{ \begin{alignat}{4} \|A\|_{op} &= \inf &&\{ c \geq 0 ~&&:~ \| A v \| \leq c \| v \| ~&&~ \mbox{ for all } ~&&v \in V \} \\ &= \sup &&\{ \| Av \| ~&&:~ \| v \| \leq 1 ~&&~\mbox{ and } ~&&v \in V \} \\ &= \sup &&\{ \| Av \| ~&&:~ \| v \| \lt 1 ~&&~\mbox{ and } ~&&v \in V \} \\ &= \sup &&\{ \| Av \| ~&&:~ \| v \| \in \{0,1\} ~&&~\mbox{ and } ~&&v \in V \} \\ &= \sup &&\{ \| Av \| ~&&:~ \| v \| = 1 ~&&~\mbox{ and } ~&&v \in V \} \;\;\;\text{ this equality holds if and only if } V \neq \{ 0 \} \\ &= \sup &&\bigg\{ \frac{\| Av \|}{\| v \|} ~&&:~ v \ne 0 ~&&~\mbox{ and } ~&&v \in V \bigg\} \;\;\;\text{ this equality holds if and only if } V \neq \{ 0 \}. \\ \end{alignat} }$

If $\displaystyle{ V = \{0\} }$ then the sets in the last two rows will be empty, and consequently their suprema over the set $\displaystyle{ [-\infty, \infty] }$ will equal $\displaystyle{ -\infty }$ instead of the correct value of $\displaystyle{ 0. }$ If the supremum is taken over the set $\displaystyle{ [0, \infty] }$ instead, then the supremum of the empty set is $\displaystyle{ 0 }$ and the formulas hold for any $\displaystyle{ V. }$
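
A small numerical check can make the equivalence of these formulas concrete. The sketch below assumes the diagonal matrix with entries 3 and 4 acting on $\displaystyle{ \R^2 }$ with the Euclidean norm, whose operator norm is 4, and samples vectors on a grid of angles; the matrix, the grid and the helper names are choices made for this example only.

```python
import math

A = [[3.0, 0.0], [0.0, 4.0]]  # assumed example; its Euclidean operator norm is 4

def apply(M, v):
    """Multiply the matrix M (a list of rows) by the vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def norm(v):
    """Euclidean norm."""
    return math.sqrt(sum(x * x for x in v))

# Unit vectors on a grid of 3600 angles around the circle.
unit_vectors = [
    (math.cos(k * math.pi / 1800), math.sin(k * math.pi / 1800))
    for k in range(3600)
]

# sup of ||Av|| over ||v|| = 1
sup_unit_sphere = max(norm(apply(A, v)) for v in unit_vectors)

# sup of ||Av|| / ||v|| over nonzero v (here: vectors of length 7)
scaled = [[7.0 * x for x in v] for v in unit_vectors]
sup_ratio = max(norm(apply(A, v)) / norm(v) for v in scaled)

# sup of ||Av|| over the closed unit ball (sampled at a few radii, including 0)
sup_closed_ball = max(
    norm(apply(A, [r * x for x in v]))
    for v in unit_vectors
    for r in (0.0, 0.25, 0.5, 1.0)
)

# inf definition: c = 4 satisfies ||Av|| <= c ||v|| on the sample, c = 3.9 does not
c4_works = all(norm(apply(A, v)) <= 4.0 * norm(v) + 1e-12 for v in scaled)
c39_works = all(norm(apply(A, v)) <= 3.9 * norm(v) for v in scaled)

print(sup_unit_sphere, sup_ratio, sup_closed_ball, c4_works, c39_works)
```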

Importantly, a linear operator $\displaystyle{ A : V \to W }$ is not, in general, guaranteed to achieve its norm $\displaystyle{ \|A\|_{op} = \sup \{\|A v\| : \|v\| \leq 1, v \in V\} }$ on the closed unit ball $\displaystyle{ \{v \in V : \|v\| \leq 1\}, }$ meaning that there might not exist any vector $\displaystyle{ u \in V }$ of norm $\displaystyle{ \|u\| \leq 1 }$ such that $\displaystyle{ \|A\|_{op} = \|A u\| }$ (if such a vector does exist and if $\displaystyle{ A \neq 0, }$ then $\displaystyle{ u }$ would necessarily have unit norm $\displaystyle{ \|u\| = 1 }$). R.C. James proved James's theorem in 1964, which states that a Banach space $\displaystyle{ V }$ is reflexive if and only if every bounded linear functional $\displaystyle{ f \in V^* }$ achieves its norm on the closed unit ball.[4] It follows, in particular, that every non-reflexive Banach space has some bounded linear functional (a type of bounded linear operator) that does not achieve its norm on the closed unit ball.

If $\displaystyle{ A : V \to W }$ is bounded then[5] $\displaystyle{ \|A\|_{op} = \sup \left\{\left|w^*(A v)\right| : \|v\| \leq 1, \left\|w^*\right\| \leq 1 \text{ where } v \in V, w^* \in W^*\right\} }$ and[5] $\displaystyle{ \|A\|_{op} = \left\|{}^tA\right\|_{op} }$ where $\displaystyle{ {}^t A : W^* \to V^* }$ is the transpose of $\displaystyle{ A : V \to W, }$ which is the linear operator defined by $\displaystyle{ w^* \,\mapsto\, w^* \circ A. }$

Properties

The operator norm is indeed a norm on the space of all bounded operators between $\displaystyle{ V }$ and $\displaystyle{ W }$. This means $\displaystyle{ \|A\|_{op} \geq 0 \mbox{ and } \|A\|_{op} = 0 \mbox{ if and only if } A = 0, }$ $\displaystyle{ \|aA\|_{op} = |a| \|A\|_{op} \mbox{ for every scalar } a , }$ $\displaystyle{ \|A + B\|_{op} \leq \|A\|_{op} + \|B\|_{op}. }$

The following inequality is an immediate consequence of the definition: $\displaystyle{ \|Av\| \leq \|A\|_{op} \|v\| \ \mbox{ for every }\ v \in V. }$

The operator norm is also compatible with the composition, or multiplication, of operators: if $\displaystyle{ V }$, $\displaystyle{ W }$ and $\displaystyle{ X }$ are three normed spaces over the same base field, and $\displaystyle{ A : V \to W }$ and $\displaystyle{ B : W \to X }$ are two bounded operators, then the operator norm is sub-multiplicative, that is: $\displaystyle{ \|BA\|_{op} \leq \|B\|_{op} \|A\|_{op}. }$
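
As a sanity check of this inequality, the following sketch uses the operator norm induced by the $\displaystyle{ \ell_\infty }$ norm on both domain and codomain, which for a matrix is the maximum absolute row sum. The two matrices and the helper names `inf_norm` and `matmul` are assumptions chosen for illustration.

```python
def inf_norm(M):
    """Operator norm induced by the l-infinity norm: maximum absolute row sum."""
    return max(sum(abs(x) for x in row) for row in M)

def matmul(B, A):
    """Matrix product B A for matrices given as lists of rows."""
    return [[sum(b * a for b, a in zip(row, col)) for col in zip(*A)] for row in B]

A = [[1.0, 2.0], [3.0, 4.0]]   # assumed example matrices
B = [[1.0, -1.0], [2.0, 0.0]]
BA = matmul(B, A)

print(inf_norm(BA), inf_norm(B) * inf_norm(A))  # 6.0 14.0
```

In this example the inequality is strict, which shows that the bound need not be attained.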

For bounded operators on $\displaystyle{ V }$, this implies that operator multiplication is jointly continuous.

It follows from the definition that if a sequence of operators converges in operator norm, it converges uniformly on bounded sets.

Table of common operator norms

By choosing different norms for the codomain, used in computing $\displaystyle{ \|Av\| }$, and the domain, used in computing $\displaystyle{ \|v\| }$, we obtain different values for the operator norm. Some common operator norms are easy to calculate, and others are NP-hard. Except for the NP-hard norms, all these norms can be calculated in $\displaystyle{ N^2 }$ operations (for an $\displaystyle{ N \times N }$ matrix), with the exception of the $\displaystyle{ \ell_2 - \ell_2 }$ norm (which requires $\displaystyle{ N^3 }$ operations for the exact answer, or fewer if you approximate it with the power method or Lanczos iterations).

Computability of Operator Norms[6]

| Domain \ Co-domain | $\displaystyle{ \ell_1 }$ | $\displaystyle{ \ell_2 }$ | $\displaystyle{ \ell_\infty }$ |
|---|---|---|---|
| $\displaystyle{ \ell_1 }$ | Maximum $\displaystyle{ \ell_1 }$ norm of a column | Maximum $\displaystyle{ \ell_2 }$ norm of a column | Maximum $\displaystyle{ \ell_{\infty} }$ norm of a column |
| $\displaystyle{ \ell_2 }$ | NP-hard | Maximum singular value | Maximum $\displaystyle{ \ell_2 }$ norm of a row |
| $\displaystyle{ \ell_\infty }$ | NP-hard | NP-hard | Maximum $\displaystyle{ \ell_1 }$ norm of a row |

The norm of the adjoint or transpose can be computed as follows. For any $\displaystyle{ p, q, }$ we have $\displaystyle{ \|A\|_{p\rightarrow q} = \|A^*\|_{q'\rightarrow p'} }$ where $\displaystyle{ p', q' }$ are Hölder conjugate to $\displaystyle{ p, q, }$ that is, $\displaystyle{ 1/p + 1/p' = 1 }$ and $\displaystyle{ 1/q + 1/q' = 1. }$
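
The closed-form entries of the table, together with the duality identity, can be checked on a small real matrix. The sketch below is illustrative only: the function names (`norm_1_to_1` and so on) and the example matrix are assumptions, and the NP-hard cases and the maximum singular value are deliberately left out.

```python
import math

def columns(A):
    """Columns of A as lists; equivalently, the rows of the transpose."""
    return [list(col) for col in zip(*A)]

def l1(v):
    return sum(abs(x) for x in v)

def l2(v):
    return math.sqrt(sum(abs(x) ** 2 for x in v))

def linf(v):
    return max(abs(x) for x in v)

# Domain l1: the maximum over columns of the codomain norm of each column.
def norm_1_to_1(A):
    return max(l1(c) for c in columns(A))

def norm_1_to_2(A):
    return max(l2(c) for c in columns(A))

def norm_1_to_inf(A):
    return max(linf(c) for c in columns(A))

# Codomain l-infinity: the maximum over rows of the dual domain norm of each row.
def norm_2_to_inf(A):
    return max(l2(r) for r in A)

def norm_inf_to_inf(A):
    return max(l1(r) for r in A)

A = [[1.0, -2.0], [3.0, 4.0]]  # assumed example matrix
A_T = columns(A)               # transpose (the adjoint of a real matrix)

print(norm_1_to_1(A), norm_inf_to_inf(A))      # 6.0 7.0
# Duality: (p, q) = (1, 1) pairs with (q', p') = (inf, inf),
# and (p, q) = (1, 2) pairs with (q', p') = (2, inf).
print(norm_1_to_1(A) == norm_inf_to_inf(A_T))  # True
print(norm_1_to_2(A) == norm_2_to_inf(A_T))    # True
```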

Operators on a Hilbert space

Suppose $\displaystyle{ H }$ is a real or complex Hilbert space. If $\displaystyle{ A : H \to H }$ is a bounded linear operator, then we have $\displaystyle{ \|A\|_{op} = \left\|A^*\right\|_{op} }$ and $\displaystyle{ \left\|A^* A\right\|_{op} = \|A\|_{op}^2, }$ where $\displaystyle{ A^{*} }$ denotes the adjoint operator of $\displaystyle{ A }$ (which in Euclidean spaces with the standard inner product corresponds to the conjugate transpose of the matrix $\displaystyle{ A }$).
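
Both identities can be verified numerically in the simplest case $\displaystyle{ H = \R^2, }$ where the adjoint is the transpose. The sketch below uses the closed form for the largest eigenvalue of a symmetric 2-by-2 matrix; the helper names and the example matrix are assumptions made for this illustration.

```python
import math

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Matrix product X Y for matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def spectral_norm_2x2(M):
    """Euclidean operator norm of a real 2x2 matrix via the
    largest eigenvalue of the symmetric matrix M^T M."""
    (a, b), (c, d) = M
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d  # entries of M^T M
    trace, det = p + r, p * r - q * q
    largest = (trace + math.sqrt(max(trace * trace - 4.0 * det, 0.0))) / 2.0
    return math.sqrt(largest)

A = [[1.0, 1.0], [0.0, 1.0]]  # assumed example matrix
A_star = transpose(A)

norm_A = spectral_norm_2x2(A)
norm_A_star = spectral_norm_2x2(A_star)
norm_A_star_A = spectral_norm_2x2(matmul(A_star, A))

print(norm_A, norm_A_star)           # equal up to rounding
print(norm_A_star_A, norm_A ** 2)    # equal up to rounding
```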

In general, the spectral radius of $\displaystyle{ A }$ is bounded above by the operator norm of $\displaystyle{ A }$: $\displaystyle{ \rho(A) \leq \|A\|_{op}. }$

To see why equality may not always hold, consider the Jordan canonical form of a matrix in the finite-dimensional case. When there are non-zero entries on the superdiagonal, equality may be violated. The quasinilpotent operators form one class of such examples: a nonzero quasinilpotent operator $\displaystyle{ A }$ has spectrum $\displaystyle{ \{0\}, }$ so $\displaystyle{ \rho(A) = 0 }$ while $\displaystyle{ \|A\|_{op} \gt 0. }$

However, when a matrix $\displaystyle{ N }$ is normal, its Jordan canonical form is diagonal (up to unitary equivalence); this is the spectral theorem. In that case it is easy to see that $\displaystyle{ \rho(N) = \|N\|_{op}. }$
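
The contrast between the two cases can be seen on 2-by-2 real matrices. The following sketch compares the spectral radius and the Euclidean operator norm for a nilpotent Jordan block and for a symmetric (hence normal) matrix; both matrices and the helper names are assumptions chosen for illustration, and the closed forms used are specific to the 2-by-2 case.

```python
import cmath
import math

def spectral_radius_2x2(M):
    """Largest absolute value of an eigenvalue, from the characteristic polynomial."""
    (a, b), (c, d) = M
    trace, det = a + d, a * d - b * c
    disc = cmath.sqrt(trace * trace - 4.0 * det)  # may be complex
    return max(abs((trace + disc) / 2.0), abs((trace - disc) / 2.0))

def spectral_norm_2x2(M):
    """Euclidean operator norm: square root of the largest eigenvalue of M^T M."""
    (a, b), (c, d) = M
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d  # entries of M^T M
    trace, det = p + r, p * r - q * q
    largest = (trace + math.sqrt(max(trace * trace - 4.0 * det, 0.0))) / 2.0
    return math.sqrt(largest)

jordan_block = [[0.0, 1.0], [0.0, 0.0]]  # nilpotent, non-zero superdiagonal entry
symmetric = [[2.0, 1.0], [1.0, 2.0]]     # normal, eigenvalues 1 and 3

print(spectral_radius_2x2(jordan_block), spectral_norm_2x2(jordan_block))  # 0.0 1.0
print(spectral_radius_2x2(symmetric), spectral_norm_2x2(symmetric))        # 3.0 3.0
```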

Combined with the identity $\displaystyle{ \left\|A^* A\right\|_{op} = \|A\|_{op}^2, }$ this formula can sometimes be used to compute the operator norm of a given bounded operator $\displaystyle{ A }$: define the Hermitian (hence normal) operator $\displaystyle{ B = A^{*} A, }$ determine its spectral radius, and take the square root to obtain the operator norm of $\displaystyle{ A. }$

The space of bounded operators on $\displaystyle{ H, }$ with the topology induced by operator norm, is not separable. For example, consider the Lp space $\displaystyle{ L^2[0, 1], }$ which is a Hilbert space. For $\displaystyle{ 0 \lt t \leq 1, }$ let $\displaystyle{ \Omega_t }$ be the characteristic function of $\displaystyle{ [0, t], }$ and $\displaystyle{ P_t }$ be the multiplication operator given by $\displaystyle{ \Omega_t, }$ that is, $\displaystyle{ P_t (f) = f \cdot \Omega_t. }$

Then each $\displaystyle{ P_t }$ is a bounded operator with operator norm 1 and $\displaystyle{ \left\|P_t - P_s\right\|_{op} = 1 \quad \mbox{ for all } \quad t \neq s. }$

But $\displaystyle{ \{P_t : 0 \lt t \leq 1\} }$ is an uncountable set. This implies the space of bounded operators on $\displaystyle{ L^2([0, 1]) }$ is not separable, in operator norm. One can compare this with the fact that the sequence space $\displaystyle{ \ell^{\infty} }$ is not separable.

The associative algebra of all bounded operators on a Hilbert space, together with the operator norm and the adjoint operation, yields a C*-algebra.