Logarithmic norm

In mathematics, the logarithmic norm is a real-valued functional on operators, and is derived from either an inner product, a vector norm, or its induced operator norm. The logarithmic norm was independently introduced by Germund Dahlquist[1] and Sergei Lozinskiĭ in 1958, for square matrices. It has since been extended to nonlinear operators and unbounded operators as well.[2] The logarithmic norm has a wide range of applications, in particular in matrix theory, differential equations and numerical analysis. In the finite-dimensional setting, it is also referred to as the matrix measure or the Lozinskiĭ measure.

Original definition

Let [math]\displaystyle{ A }[/math] be a square matrix and [math]\displaystyle{ \| \cdot \| }[/math] be an induced matrix norm. The associated logarithmic norm [math]\displaystyle{ \mu }[/math] of [math]\displaystyle{ A }[/math] is defined as

[math]\displaystyle{ \mu(A) = \lim \limits_{h \rightarrow 0^+} \frac{\| I + hA \| - 1}{h} }[/math]

Here [math]\displaystyle{ I }[/math] is the identity matrix of the same dimension as [math]\displaystyle{ A }[/math], and [math]\displaystyle{ h }[/math] is a real, positive number. The limit as [math]\displaystyle{ h\rightarrow 0^- }[/math] equals [math]\displaystyle{ -\mu(-A) }[/math], and is in general different from the logarithmic norm [math]\displaystyle{ \mu(A) }[/math], as [math]\displaystyle{ -\mu(-A) \leq \mu(A) }[/math] for all matrices.
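
The two one-sided limits can be approximated numerically. The following is a minimal sketch, assuming the max-norm (largest absolute row sum) as the induced norm and a small fixed step in place of the limit; the helper names and the test matrix are illustrative choices, not part of the original definition.

```python
def inf_norm(M):
    """Induced max-norm of a matrix given as a list of rows."""
    return max(sum(abs(x) for x in row) for row in M)

def identity_plus(h, A):
    """Return I + h*A for a square matrix A given as a list of rows."""
    n = len(A)
    return [[(1.0 if i == j else 0.0) + h * A[i][j] for j in range(n)]
            for i in range(n)]

def difference_quotient(A, h):
    """(||I + hA|| - 1) / h, the quantity whose one-sided limits are taken."""
    return (inf_norm(identity_plus(h, A)) - 1.0) / h

# Illustrative test matrix (an assumption, not taken from the article).
A = [[-2.0, 1.0],
     [1.0, -3.0]]

right = difference_quotient(A, 1e-6)    # h -> 0+, approximates mu(A)
left = difference_quotient(A, -1e-6)    # h -> 0-, approximates -mu(-A)
print(right, left)
```

For this matrix the two quotients are close to -1 and -4 respectively, so the left limit is indeed smaller than the right one.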

The matrix norm [math]\displaystyle{ \|A\| }[/math] is always positive if [math]\displaystyle{ A\neq 0 }[/math], but the logarithmic norm [math]\displaystyle{ \mu(A) }[/math] may also take negative values, e.g. when [math]\displaystyle{ A }[/math] is negative definite. Therefore, the logarithmic norm does not satisfy the axioms of a norm. The name logarithmic norm, which does not appear in the original reference, seems to originate from estimating the logarithm of the norm of solutions to the differential equation

[math]\displaystyle{ \dot x = Ax. }[/math]

The maximal growth rate of [math]\displaystyle{ \log \|x\| }[/math] is [math]\displaystyle{ \mu(A) }[/math]. This is expressed by the differential inequality

[math]\displaystyle{ \frac{\mathrm d}{\mathrm d t^+} \log \|x\| \leq \mu(A), }[/math]

where [math]\displaystyle{ \mathrm d/\mathrm dt^+ }[/math] is the upper right Dini derivative. Using logarithmic differentiation the differential inequality can also be written

[math]\displaystyle{ \frac{\mathrm d\|x\|}{\mathrm d t^+} \leq \mu(A)\cdot \|x\|, }[/math]

showing its direct relation to Grönwall's lemma. In fact, it can be shown that the norm of the state transition matrix [math]\displaystyle{ \Phi(t, t_0) }[/math] associated to the differential equation [math]\displaystyle{ \dot x = A(t)x }[/math] is bounded by[3][4]

[math]\displaystyle{ \exp\left(-\int_{t_0}^{t} \mu(-A(s)) ds \right) \le \|\Phi(t,t_0)\| \le \exp\left(\int_{t_0}^{t} \mu(A(s)) ds \right) }[/math]

for all [math]\displaystyle{ t \ge t_0 }[/math].

Alternative definitions

If the vector norm is an inner product norm, as in a Hilbert space, then the logarithmic norm is the smallest number [math]\displaystyle{ \mu(A) }[/math] such that for all [math]\displaystyle{ x }[/math]

[math]\displaystyle{ \real\langle x, Ax\rangle \leq \mu(A)\cdot \|x\|^2 }[/math]

Unlike the original definition, the latter expression also allows [math]\displaystyle{ A }[/math] to be unbounded. Thus differential operators too can have logarithmic norms, allowing the use of the logarithmic norm both in algebra and in analysis. The modern, extended theory therefore prefers a definition based on inner products or duality. Both the operator norm and the logarithmic norm are then associated with extremal values of quadratic forms as follows:

[math]\displaystyle{ \|A\|^2 = \sup_{x\neq 0}{\frac { \langle Ax, Ax\rangle }{ \langle x,x\rangle }}\,; \qquad \mu(A) = \sup_{x\neq 0} {\frac {\real\langle x, Ax\rangle }{ \langle x,x \rangle }} }[/math]

Properties

Basic properties of the logarithmic norm of a matrix include:

  1. [math]\displaystyle{ \mu(zI) = \real\,(z) }[/math]
  2. [math]\displaystyle{ \mu(A) \leq \|A\| }[/math]
  3. [math]\displaystyle{ \mu(\gamma A) = \gamma \mu(A)\, }[/math] for scalar [math]\displaystyle{ \gamma \gt 0 }[/math]
  4. [math]\displaystyle{ \mu(A+zI) = \mu(A) + \real\,(z) }[/math]
  5. [math]\displaystyle{ \mu(A + B) \leq \mu(A) + \mu(B) }[/math]
  6. [math]\displaystyle{ \alpha(A) \leq \mu(A)\, }[/math] where [math]\displaystyle{ \alpha(A) }[/math] is the maximal real part of the eigenvalues of [math]\displaystyle{ A }[/math]
  7. [math]\displaystyle{ \|\mathrm e^{tA}\| \leq \mathrm e^{t\mu(A)}\, }[/math] for [math]\displaystyle{ t \geq 0 }[/math]
  8. [math]\displaystyle{ \mu(A) \lt 0 \, \Rightarrow \, \|A^{-1}\| \leq -1/\mu(A) }[/math]
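
Properties 7 and 8 can be checked numerically on a small example. The sketch below assumes the max-norm, uses the row-sum formula for its logarithmic norm given in the examples section of this article, and approximates the matrix exponential by a truncated Taylor series, which is adequate only for small matrices of moderate norm. All helper names and the test matrix are illustrative assumptions.

```python
import math

def inf_norm(M):
    """Induced max-norm: largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in M)

def mu_inf(A):
    """Logarithmic max-norm of a real matrix: max_i (a_ii + sum_{j != i} |a_ij|)."""
    n = len(A)
    return max(A[i][i] + sum(abs(A[i][j]) for j in range(n) if j != i)
               for i in range(n))

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm_taylor(A, t, terms=80):
    """exp(tA) by a truncated Taylor series (fine for small ||tA|| only)."""
    n = len(A)
    tA = [[t * a for a in row] for row in A]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term becomes (tA)^k / k!
        term = [[x / k for x in row] for row in matmul(term, tA)]
        result = [[r + s for r, s in zip(rr, ss)] for rr, ss in zip(result, term)]
    return result

def inverse_2x2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[-2.0, 1.0],
     [1.0, -3.0]]
mu = mu_inf(A)

# Property 7: ||exp(tA)|| <= exp(t * mu(A)) for t >= 0 (small rounding tolerance).
prop7 = all(inf_norm(expm_taylor(A, t)) <= math.exp(t * mu) + 1e-12
            for t in (0.0, 0.1, 0.5, 1.0, 2.0))

# Property 8: mu(A) < 0 implies ||A^{-1}|| <= -1 / mu(A).
prop8 = mu < 0 and inf_norm(inverse_2x2(A)) <= -1.0 / mu

print(mu, prop7, prop8)
```

A production computation would normally rely on a library routine for the matrix exponential rather than a Taylor series; the series is used here only to keep the sketch dependency-free.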

Example logarithmic norms

The logarithmic norm of a matrix can be calculated as follows for the three most common norms. In these formulas, [math]\displaystyle{ a_{ij} }[/math] represents the element on the [math]\displaystyle{ i }[/math]th row and [math]\displaystyle{ j }[/math]th column of a matrix [math]\displaystyle{ A }[/math].[5]

  • [math]\displaystyle{ \mu_1(A) = \sup \limits_j \left( \real (a_{jj}) + \sum \limits_{ i \neq j} |a_{ij}| \right) }[/math]
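  • [math]\displaystyle{ \displaystyle \mu_{2}(A) = \lambda_{\max}\left(\frac{A+A^{\mathrm T}}{2}\right) }[/math] (for complex [math]\displaystyle{ A }[/math], the transpose is replaced by the conjugate transpose)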
  • [math]\displaystyle{ \mu_{\infty}(A) = \sup \limits_i \left( \real (a_{ii}) + \sum \limits_{ j \neq i} |a_{ij}| \right) }[/math]
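
The three formulas above translate directly into code. The following sketch is dependency-free and therefore restricts the Euclidean case to real 2×2 matrices, where the largest eigenvalue of the symmetric part has a closed form; a general implementation would call a symmetric eigenvalue solver instead. Function names and the test matrix are illustrative assumptions.

```python
import math

def mu_1(A):
    """max over columns j of Re(a_jj) + sum_{i != j} |a_ij|."""
    n = len(A)
    return max(complex(A[j][j]).real
               + sum(abs(A[i][j]) for i in range(n) if i != j)
               for j in range(n))

def mu_inf(A):
    """max over rows i of Re(a_ii) + sum_{j != i} |a_ij|."""
    n = len(A)
    return max(complex(A[i][i]).real
               + sum(abs(A[i][j]) for j in range(n) if j != i)
               for i in range(n))

def mu_2_real_2x2(A):
    """Largest eigenvalue of (A + A^T)/2 for a real 2x2 matrix (closed form)."""
    (a, b), (c, d) = A
    s = 0.5 * (b + c)   # off-diagonal entry of the symmetric part
    return 0.5 * (a + d) + math.hypot(0.5 * (a - d), s)

# Illustrative matrix (an assumption); its eigenvalues are -2 and -5.
A = [[-3.0, 1.0],
     [2.0, -4.0]]

values = (mu_1(A), mu_2_real_2x2(A), mu_inf(A))
print(values)
```

For this matrix the three logarithmic norms differ, but each of them is at least the maximal real part of the eigenvalues (here -2), in line with property 6.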

Applications in matrix theory and spectral theory

The logarithmic norm is related to the extreme values of the Rayleigh quotient. It holds that

[math]\displaystyle{ -\mu(-A) \leq {\frac {x^{\mathrm T}Ax}{x^{\mathrm T}x}} \leq \mu(A), }[/math]

and both extreme values are taken for some vectors [math]\displaystyle{ x\neq 0 }[/math]. This also means that every eigenvalue [math]\displaystyle{ \lambda_k }[/math] of [math]\displaystyle{ A }[/math] satisfies

[math]\displaystyle{ -\mu(-A) \leq \real\, \lambda_k \leq \mu(A) }[/math].

More generally, the logarithmic norm is related to the numerical range of a matrix.
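
The Rayleigh quotient bounds can be illustrated by random sampling. This is a minimal sketch assuming the Euclidean inner product and a real 2×2 matrix, so that both bounds are the extreme eigenvalues of the symmetric part and have a closed form; the matrix, the sample size and the helper names are illustrative assumptions.

```python
import math
import random

def sym_eig_extremes_2x2(A):
    """Smallest and largest eigenvalue of (A + A^T)/2 for a real 2x2 matrix."""
    (a, b), (c, d) = A
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), 0.5 * (b + c))
    return mean - radius, mean + radius

def rayleigh_quotient(A, x):
    """x^T A x / x^T x for a nonzero real vector x."""
    Ax = [sum(a * xi for a, xi in zip(row, x)) for row in A]
    return sum(xi * yi for xi, yi in zip(x, Ax)) / sum(xi * xi for xi in x)

A = [[-3.0, 1.0],
     [2.0, -4.0]]                      # eigenvalues are -2 and -5

lower, upper = sym_eig_extremes_2x2(A)  # -mu_2(-A) and mu_2(A)

rng = random.Random(0)                  # fixed seed for reproducibility
quotients = [rayleigh_quotient(A, [rng.gauss(0, 1), rng.gauss(0, 1)])
             for _ in range(1000)]

inside = all(lower - 1e-12 <= q <= upper + 1e-12 for q in quotients)
print(lower, upper, inside)
```

Both eigenvalues of the sample matrix also lie in the interval between the two bounds, as the eigenvalue inequality requires.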

A matrix with [math]\displaystyle{ -\mu(-A)\gt 0 }[/math] is positive definite, and one with [math]\displaystyle{ \mu(A)\lt 0 }[/math] is negative definite. Such matrices have inverses. The inverse of a negative definite matrix is bounded by

[math]\displaystyle{ \|A^{-1}\|\leq - {\frac {1}{\mu(A)}}. }[/math]

Both the bounds on the inverse and on the eigenvalues hold irrespective of the choice of vector (matrix) norm. Some results only hold for inner product norms, however. For example, if [math]\displaystyle{ R }[/math] is a rational function with the property

[math]\displaystyle{ \real \, (z)\leq 0 \, \Rightarrow \, |R(z)|\leq 1 }[/math]

then, for inner product norms,

[math]\displaystyle{ \mu(A)\leq 0 \, \Rightarrow \, \|R(A)\|\leq 1. }[/math]

Thus the matrix norm and logarithmic norms may be viewed as generalizing the modulus and real part, respectively, from complex numbers to matrices.

Applications in stability theory and numerical analysis

The logarithmic norm plays an important role in the stability analysis of a continuous dynamical system [math]\displaystyle{ \dot x = Ax }[/math]. Its role is analogous to that of the matrix norm for a discrete dynamical system [math]\displaystyle{ x_{n+1} = Ax_n }[/math].

In the simplest case, when [math]\displaystyle{ A }[/math] is a scalar complex constant [math]\displaystyle{ \lambda }[/math], the discrete dynamical system has stable solutions when [math]\displaystyle{ |\lambda|\leq 1 }[/math], while the differential equation has stable solutions when [math]\displaystyle{ \real\,\lambda\leq 0 }[/math]. When [math]\displaystyle{ A }[/math] is a matrix, the discrete system has stable solutions if [math]\displaystyle{ \|A\|\leq 1 }[/math]. In the continuous system, the solutions are of the form [math]\displaystyle{ \mathrm e^{tA}x(0) }[/math]. They are stable if [math]\displaystyle{ \|\mathrm e^{tA}\|\leq 1 }[/math] for all [math]\displaystyle{ t\geq 0 }[/math], which by property 7 above holds whenever [math]\displaystyle{ \mu(A)\leq 0 }[/math]. In the latter case, [math]\displaystyle{ \|x\| }[/math] is a Lyapunov function for the system.

Runge–Kutta methods for the numerical solution of [math]\displaystyle{ \dot x = Ax }[/math] replace the differential equation by a discrete equation [math]\displaystyle{ x_{n+1} = R(hA)\cdot x_n }[/math], where the rational function [math]\displaystyle{ R }[/math] is characteristic of the method, and [math]\displaystyle{ h }[/math] is the time step size. If [math]\displaystyle{ |R(z)|\leq 1 }[/math] whenever [math]\displaystyle{ \real\,(z)\leq 0 }[/math], then a stable differential equation, having [math]\displaystyle{ \mu(A)\leq 0 }[/math], will always result in a stable (contractive) numerical method, as [math]\displaystyle{ \|R(hA)\|\leq 1 }[/math]. Runge–Kutta methods having this property are called A-stable.
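
As an illustration, the implicit Euler method has [math]\displaystyle{ R(z) = 1/(1-z) }[/math] and is A-stable, while the explicit Euler method has [math]\displaystyle{ R(z) = 1+z }[/math] and is not. The sketch below compares the two in the Euclidean norm for a real 2×2 matrix with negative logarithmic norm; the matrix, the step sizes and the helper names are illustrative assumptions.

```python
import math

def two_norm_2x2(M):
    """Spectral norm of a real 2x2 matrix: sqrt of the largest eigenvalue of M^T M."""
    (a, b), (c, d) = M
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d   # M^T M = [[p, q], [q, r]]
    return math.sqrt(0.5 * (p + r) + math.hypot(0.5 * (p - r), q))

def mu_2_real_2x2(A):
    """Largest eigenvalue of (A + A^T)/2 for a real 2x2 matrix."""
    (a, b), (c, d) = A
    return 0.5 * (a + d) + math.hypot(0.5 * (a - d), 0.5 * (b + c))

def explicit_euler(A, h):
    """R(hA) with R(z) = 1 + z."""
    (a, b), (c, d) = A
    return [[1 + h * a, h * b], [h * c, 1 + h * d]]

def implicit_euler(A, h):
    """R(hA) with R(z) = 1 / (1 - z), i.e. the inverse of I - hA."""
    (a, b), (c, d) = A
    m = [[1 - h * a, -h * b], [-h * c, 1 - h * d]]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

A = [[-2.0, 1.0],
     [0.0, -3.0]]
mu = mu_2_real_2x2(A)                     # negative for this matrix

steps = (0.1, 1.0, 10.0)
implicit_norms = [two_norm_2x2(implicit_euler(A, h)) for h in steps]
explicit_norms = [two_norm_2x2(explicit_euler(A, h)) for h in steps]
print(mu, implicit_norms, explicit_norms)
```

The implicit Euler norms stay below 1 for every step size, whereas the explicit Euler norm exceeds 1 once the step is too large, for instance at step size 1 in this example.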

Retaining the same form, the results can, under additional assumptions, be extended to nonlinear systems as well as to semigroup theory, where the crucial advantage of the logarithmic norm is that it discriminates between forward and reverse time evolution and can establish whether the problem is well posed. Similar results also apply in the stability analysis in control theory, where there is a need to discriminate between positive and negative feedback.

Applications to elliptic differential operators

In connection with differential operators it is common to use inner products and integration by parts. In the simplest case we consider functions satisfying [math]\displaystyle{ u(0)=u(1)=0 }[/math] with inner product

[math]\displaystyle{ \langle u,v\rangle = \int_0^1 uv\, \mathrm dx. }[/math]

Then it holds that

[math]\displaystyle{ \langle u,u''\rangle = -\langle u',u'\rangle \leq -\pi^2\|u\|^2, }[/math]

where the equality on the left represents integration by parts, and the inequality to the right is a Sobolev-type inequality (often called the Poincaré or Wirtinger inequality). In the latter, equality is attained for the function [math]\displaystyle{ \sin\, \pi x }[/math], implying that the constant [math]\displaystyle{ -\pi^2 }[/math] is the best possible. Thus

[math]\displaystyle{ \langle u, Au\rangle \leq -\pi^2 \|u\|^2 }[/math]

for the differential operator [math]\displaystyle{ A=\mathrm d^2/\mathrm dx^2 }[/math], which implies that

[math]\displaystyle{ \mu({\frac {\mathrm d^2}{\mathrm dx^2}}) = -\pi^2. }[/math]

As an operator satisfying [math]\displaystyle{ \langle u,Au \rangle \gt 0 }[/math] is called elliptic, the logarithmic norm quantifies the (strong) ellipticity of [math]\displaystyle{ -\mathrm d^2/\mathrm dx^2 }[/math]. Thus, if [math]\displaystyle{ A }[/math] is strongly elliptic, then [math]\displaystyle{ \mu(-A)\lt 0 }[/math], and [math]\displaystyle{ A }[/math] is invertible given proper data.

If a finite difference method is used to solve [math]\displaystyle{ -u''=f }[/math], the problem is replaced by an algebraic equation [math]\displaystyle{ Tu=f }[/math]. The matrix [math]\displaystyle{ T }[/math] will typically inherit the ellipticity, i.e., [math]\displaystyle{ -\mu(-T)\gt 0 }[/math], showing that [math]\displaystyle{ T }[/math] is positive definite and therefore invertible.
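
For a concrete case, assume the standard second-order central difference scheme on a uniform grid with [math]\displaystyle{ N }[/math] interior points and mesh width [math]\displaystyle{ h = 1/(N+1) }[/math], so that [math]\displaystyle{ T }[/math] is the tridiagonal matrix with stencil [math]\displaystyle{ (-1, 2, -1)/h^2 }[/math]. This particular discretization is an assumption made for illustration. Since this [math]\displaystyle{ T }[/math] is symmetric, [math]\displaystyle{ -\mu_2(-T) }[/math] is its smallest eigenvalue, which is known in closed form and approaches [math]\displaystyle{ \pi^2 }[/math] as the grid is refined. The sketch checks the eigenpair and the convergence.

```python
import math

def apply_T(u, h):
    """Apply T = tridiag(-1, 2, -1) / h^2 to interior values u_1..u_N,
    using zero boundary values u_0 = u_{N+1} = 0."""
    n = len(u)
    padded = [0.0] + list(u) + [0.0]
    return [(-padded[j - 1] + 2.0 * padded[j] - padded[j + 1]) / h**2
            for j in range(1, n + 1)]

def smallest_eigenvalue(n):
    """lambda_min(T) = (4 / h^2) * sin^2(pi * h / 2), with h = 1 / (n + 1)."""
    h = 1.0 / (n + 1)
    return 4.0 / h**2 * math.sin(0.5 * math.pi * h) ** 2

n = 50
h = 1.0 / (n + 1)
v = [math.sin(math.pi * j * h) for j in range(1, n + 1)]   # sampled sin(pi x)
lam = smallest_eigenvalue(n)

# Residual of the eigenvalue equation T v = lam v (should be at rounding level).
residual = max(abs(t - lam * x) for t, x in zip(apply_T(v, h), v))

# -mu_2(-T) for increasingly fine grids; the values approach pi^2 from below.
refined = [smallest_eigenvalue(m) for m in (10, 100, 1000)]
print(residual, refined, math.pi ** 2)
```

The discrete bound is positive on every grid, which is the inherited ellipticity referred to above.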

These results carry over to the Poisson equation as well as to other numerical methods such as the finite element method.

Extensions to nonlinear maps

For nonlinear operators the operator norm and logarithmic norm are defined in terms of the inequalities

[math]\displaystyle{ l(f)\cdot \|u-v\| \leq \|f(u)-f(v)\| \leq L(f)\cdot \|u-v\|, }[/math]

where [math]\displaystyle{ L(f) }[/math] is the least upper bound Lipschitz constant of [math]\displaystyle{ f }[/math], and [math]\displaystyle{ l(f) }[/math] is the greatest lower bound Lipschitz constant; and

[math]\displaystyle{ m(f)\cdot \|u-v\|^2 \leq \langle u-v, f(u)-f(v)\rangle \leq M(f)\cdot \|u-v\|^2, }[/math]

where [math]\displaystyle{ u }[/math] and [math]\displaystyle{ v }[/math] are in the domain [math]\displaystyle{ D }[/math] of [math]\displaystyle{ f }[/math]. Here [math]\displaystyle{ M(f) }[/math] is the least upper bound logarithmic Lipschitz constant of [math]\displaystyle{ f }[/math], and [math]\displaystyle{ m(f) }[/math] is the greatest lower bound logarithmic Lipschitz constant. It holds that [math]\displaystyle{ m(f)=-M(-f) }[/math] (compare above) and, analogously, [math]\displaystyle{ l(f)=L(f^{-1})^{-1} }[/math], where [math]\displaystyle{ L(f^{-1}) }[/math] is defined on the image of [math]\displaystyle{ f }[/math].

For nonlinear operators that are Lipschitz continuous, it further holds that

[math]\displaystyle{ M(f) = \lim_{h\rightarrow 0^+}{\frac {L(I+hf)-1}{h}}. }[/math]

If [math]\displaystyle{ f }[/math] is differentiable and its domain [math]\displaystyle{ D }[/math] is convex, then

[math]\displaystyle{ L(f) = \sup_{x\in D} \|f'(x)\| }[/math] and [math]\displaystyle{ \displaystyle M(f) = \sup_{x\in D} \mu(f'(x)). }[/math]

Here [math]\displaystyle{ f'(x) }[/math] is the Jacobian matrix of [math]\displaystyle{ f }[/math], linking the nonlinear extension to the matrix norm and logarithmic norm.
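
A scalar example makes these formulas concrete. The sketch below assumes the map [math]\displaystyle{ f(x) = -x - x^3 }[/math] on the convex domain [math]\displaystyle{ D = [-1, 1] }[/math], a choice made purely for illustration. In one dimension the Jacobian is the ordinary derivative and its logarithmic norm is the derivative itself, so the suprema can be estimated on a grid and the defining inequalities tested on random pairs. The last check anticipates the inverse bound stated at the end of this section.

```python
import random

def f(x):
    return -x - x**3

def f_prime(x):
    return -1.0 - 3.0 * x**2

# Grid samples of D = [-1, 1]; the grid contains the end points and 0.
grid = [-1.0 + 2.0 * k / 1000 for k in range(1001)]
M = max(f_prime(x) for x in grid)        # sup of mu(f'(x)) = sup of f'(x)
L = max(abs(f_prime(x)) for x in grid)   # sup of |f'(x)|

rng = random.Random(1)                   # fixed seed for reproducibility
pairs = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(1000)]

# <u - v, f(u) - f(v)> <= M(f) * |u - v|^2
ok_M = all((u - v) * (f(u) - f(v)) <= M * (u - v) ** 2 + 1e-12
           for u, v in pairs)
# |f(u) - f(v)| <= L(f) * |u - v|
ok_L = all(abs(f(u) - f(v)) <= L * abs(u - v) + 1e-12 for u, v in pairs)
# M(f) < 0 gives |u - v| <= (-1 / M(f)) * |f(u) - f(v)|
ok_inv = all(abs(u - v) <= (-1.0 / M) * abs(f(u) - f(v)) + 1e-12
             for u, v in pairs)

print(M, L, ok_M, ok_L, ok_inv)
```

Here the grid estimates coincide with the exact suprema because the extreme values of the derivative are attained at grid points; in general a grid only gives approximations of the two suprema.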

An operator having either [math]\displaystyle{ m(f) \gt 0 }[/math] or [math]\displaystyle{ M(f) \lt 0 }[/math] is called uniformly monotone. An operator satisfying [math]\displaystyle{ L(f) \lt 1 }[/math] is called contractive. This extension offers many connections to fixed point theory and critical point theory.

The theory becomes analogous to that of the logarithmic norm for matrices, but is more complicated as the domains of the operators need to be given close attention, as is the case with unbounded operators. Property 8 of the logarithmic norm above carries over, independently of the choice of vector norm, and it holds that

[math]\displaystyle{ M(f)\lt 0\,\Rightarrow\,L(f^{-1})\leq -{\frac {1}{M(f)}}, }[/math]

which quantifies the Uniform Monotonicity Theorem due to Browder & Minty (1963).

References

  1. Germund Dahlquist, "Stability and error bounds in the numerical integration of ordinary differential equations", Almqvist & Wiksell, Uppsala 1958
  2. Gustaf Söderlind, "The logarithmic norm. History and modern theory", BIT Numerical Mathematics, 46(3):631-652, 2006
  3. Desoer, C.; Haneda, H. (1972). "The measure of a matrix as a tool to analyze computer algorithms for circuit analysis". IEEE Transactions on Circuit Theory 19 (5): 480–486. doi:10.1109/tct.1972.1083507. 
  4. Desoer, C. A.; Vidyasagar, M. (1975). Feedback Systems: Input-output Properties. New York: Elsevier. p. 34. ISBN 9780323157797. 
  5. Desoer, C. A.; Vidyasagar, M. (1975). Feedback Systems: Input-output Properties. New York: Elsevier. p. 33. ISBN 9780323157797.