Logarithmic norm

From HandWiki
Short description: Mathematical function often applied to matrices

In mathematics, the logarithmic norm is a real-valued functional on operators, constructed from either a vector norm or an inner product, or directly from the induced operator norm. It quantifies key notions such as positive/negative definiteness in matrix theory, uniformly coercive or monotone vector fields in nonlinear analysis, and strong ellipticity in differential operators on function spaces, subject to specific boundary conditions.

The logarithmic norm has a wide range of applications in matrix theory, stability theory for initial and boundary value problems in differential equations, in nonlinear analysis, applied mathematics, and numerical analysis.

Comprehensive treatments of the theory and its applications can be found in recent monographs,[1][2] on which this article is based.

History, original definition, and classical notation

The logarithmic norm was introduced in 1958, independently by Germund Dahlquist[3] and Sergei Lozinskiĭ[4] for square matrices and bounded linear operators. The original purpose was to estimate solutions to linear differential equations x˙=Ax+r, to construct sufficient conditions for stability, and to obtain norm bounds of perturbations due to the forcing function r.

Let A be a square matrix and be the operator norm induced by a given vector norm. The associated logarithmic norm μ[A] is defined by

μ[A]=limh0+I+hA1h,

where h is real and I is the identity matrix of the same dimension as A. The limit exists on account of the convexity of the matrix norm.

The term logarithmic norm is due to Lozinskiĭ, and refers to the fact that if x˙=Ax then

Dt+logxμ[A],

where Dt+ denotes the upper right Dini derivative with respect to time. In other words, μ[A] is an upper bound for the (short-term) growth rate of the "logarithmic norm" of x(t). Using logarithmic differentiation, this bound can also be written as the differential inequality

Dt+xμ[A]x,

from which it follows that, for t0,

etAetμ[A].

This bound is always preferable to the Lipschitz bound etA, since μ[A]A. Moreover, while A0, the logarithmic norm may take negative values, e.g. when A is negative definite and the Euclidean norm is used. Thus μ[A]0 implies that etA is a contraction, and hence the solutions x(t) are stable in forward time.

The ability of the logarithmic norm to distinguish forward and reverse time evolution in dynamical systems is a key property. It may also establish whether a problem is well-posed. This asymmetry is also of importance in many applications. For example, in control theory, there is a fundamental need to discriminate between positive and negative feedback; this too can be characterized by the logarithmic norm.

Depending on the setting, the logarithmic norm is related to Lyapunov stability theory.[5] Other bounds are similar to those obtained using Grönwall's lemma. Thus the state transition matrix Φ(t,t0) associated with a nonautonomous linear differential equation x˙=A(t)x satisfies the upper and lower bounds[6][7] for tt0,

exp(t0tμ[A(s)]ds)Φ(t,t0)exp(t0tμ[A(s)]ds).

More recently, the logarithmic norm has been extended and generalized to nonlinear maps[8] as well as to unbounded operators,[9] covering boundary value problems and elementary applications in elliptic and parabolic partial differential equations.[1]

The logarithmic norm may in some literature appear under less informative names, such as the "matrix measure" (for matrices)[10] and "one-sided Lipschitz constant" (for nonlinear maps),[11] although the concept neither relates to measure theory nor implies Lipschitz continuity.

Extended definitions and modern notation

For clarity and generality, the modern notation defines a greatest lower bound (or simply lower) logarithmic norm m[A] as well as the classical least upper bound (upper) logarithmic norm by the two left and right limits

m[A]=limh0I+hA1h;M[A]=limh0+I+hA1h.

Here m[A]M[A]μ[A] for all A, and the lower and upper logarithmic norms are related by m[A]M[A].

Alternatively, if an inner product norm is used, the lower and upper logarithmic norms are defined as the best possible bounds such that for all x,

m2[A]x22Rex,AxM2[A]x22,

where x22=x,x denotes the corresponding vector norm. In case of bounded linear operators in a Hilbert space setting, the inner product definition is equivalent to the original definition. But as the inner product definition also admits certain classes of unbounded operators it is more general. In the latter case one of the logarithmic norms may be infinite.

Further variants of use in semigroup theory are based on the limit

M[A]=limh0+ehA1h,

which is still equivalent to the classical definition if A is a bounded linear operator. In case etA is a strongly continuous semigroup for t0, various modifications of this alternative are necessary, since the "vector field" A:xy=Ax is defined by the infinitesimal generator of the semigroup, i.e.,

Ax=limh0+(ehAI)xh.

The structure of the limit using the matrix exponential is shared by many other pairs of functionals, linking the vector field and the flow of a linear differential equation in a natural manner. For example, it holds that

α[A]=limh0+ρ[ehA]1h;trace[A]=limh0det[ehA]1h,

where ρ[A]=maxk|λk| and α[A]=maxkReλk are the (upper) spectral radius and (upper) spectral abscissa, respectively. Both limits exist, although neither the spectral radius nor the determinant is a convex functional. Moreover, for the trace, the left and right limits coincide. Moreover,

Dtdet[etA]=trace[A]det[etA].

In spite of the similarities, there are major differences. Thus Dt+etAM[A]etA, where the important properties, enabling norm bounds, are that the operator norm is submultiplicative, and the logarithmic norm is subadditive. These properties do not hold for the spectral radius and abscissa, which are less powerful tools in the stability analysis of dynamical systems.

Properties and computation of logarithmic norms

For every choice of norm, the basic properties (including some redundancy) of the logarithmic norm of a matrix are the following:[1]

  1. M[0]=0,M[I]=1
  2. M[A]=m[A]
  3. Boundedness Am[A]M[A]A
  4. Lower and upper bounds glb[A]M[A]A
  5. Lower and upper bounds Am[A]glb[A]
  6. Continuity |M[A]M[B]|AB
  7. Real translation M[AλI]=M[A]Reλ,λ
  8. Weak homogeneity M[γA]=γM[A],γ+
  9. Sublinearity M[A+B]M[A]+M[B]
  10. Superlinearity m[A]+m[B]m[A+B]
  11. Convexity M[θA+(1θ)B]θM[A]+(1θ)M[B],θ[0,1]
  12. Concavity θm[A]+(1θ)m[B]m[θA+(1θ)B],θ[0,1]
  13. Lower and upper exponential bounds etm[A]etAetM[A],t0
  14. Positive definite inverse bound m[A]>0A11/m[A]
  15. Negative definite inverse bound M[A]<0A11/M[A]
  16. Spectral bound α[A]M[A].

As illustrated by properties 14 - 15, the condition m[A]>0 (M[A]<0) generalizes positive (negative) definiteness to arbitrary norms and non-symmetric matrices, while also quantifying these notions. An indefinite matrix with respect to a given norm satisfies m[A]0M[A].

For the most common lp norms, the logarithmic norm of a (complex) matrix A can be computed as follows, where aij represents the matrix element in the ith row and jth column, and where A* denotes the complex conjugate of A :

  • M1[A]=maxj(Reajj+ij|aij|)
  • M2[A]=α[A+A*2]
  • M[A]=maxi(Reaii+ji|aij|)

Applications in matrix theory and spectral theory

Let A be a (possibly complex) n×n matrix. The symmetric positive definite square root of A*A and the Hermitian part of A are denoted by

|A|=A*AandHe(A)=A+A*2,

respectively. Their eigenvalue problems are |A|u=σu and He(A)v=μv, where σk are the n singular values of A, and μk are the n logarithmic values of A. The singular and logarithmic values are the n stationary points of the two quadratic forms

u*|A|uu*uandRev*Avv*vv*He(A)vv*v,

respectively. As for the relations to the n eigenvalues λk of A, it holds that

glb2[A]=minσkmin|λk|max|λk|maxσk=A2

and

m2[A]=minμkminReλkmaxReλkmaxμk=M2[A]

together with

|detA|=k|λk|=kσk=det|A|

and

Retrace[A]=kReλk=kμk=trace[He(A)].

As the quadratic form x*Ax is generally complex valued, one can only find extrema of some related real function. For the logarithmic norms one takes the real part. Another option is to take the absolute value, to obtain the numerical radius of A, defined by

ν[A]=supx0|x*Ax|x*x.

Because Rez|z|, it follows that for all A,

M2[A]ν[A]A2.

The numerical radius is a matrix norm (but not an operator norm). However, as it is not defined by a quadratic form, the determination of ν[A] is more intricate. But to each matrix norm there is a logarithmic norm, and

limh0ν[I+hA]1h=m2[A];limh0+ν[I+hA]1h=M2[A],

i.e., the lower and upper logarithmic norms of the numerical radius coincide with the Euclidean logarithmic norms.[1] By the Hausdorff-Toeplitz theorem, the numerical range of a matrix A is the set

W(A)=φ[0,2π]C(A;φ)

in the complex plane, given by the intersection of the half-planes

C(A;φ)={z:ReeiφzM2[eiφA]},

and the numerical radius is

ν[A]=supzW(A)|z|=maxφ[0,2π]M2[eiφA].

For matrix functions, the use of the von Neumann inequality and spectral sets provides the following classical result, using the Euclidean norm: let a polynomial P be a such that |z|1|P(z)|1. Then A21P(A)21.

Using the logarithmic norm, this theorem can be generalized from unit circles to half-planes. Thus, if R is a rational function such that Rez0|R(z)|1, then M2[A]0R(A)21.

This property is of importance in numerical analysis, where implicit Runge-Kutta methods approximate a differential equation x˙=Ax by a recursion xk+1=R(hA)xk, where R is a rational function, referred to as the method's stability function. The continuous system is stable with a contractive flow if M2[A]0, while the discrete system is stable (a discrete contraction) if R(hA)21. Thus, if the stability function R satisfies Rez0|R(z)|1, a condition known as A-stability, then the Runge-Kutta method produces stable solutions for all negative semidefinite vector fields A.

In a similar manner the theory of logarithmic norms can also be extended to unitarily invariant norms, e.g. the Hilbert-Schmidt norm for trace class operators, the Frobenius norm, and more generally the Schatten norms. As logarithmic values are only invariant under unitary similarity transformations, the logarithmic norms are defined subordinate to the matrix norms.

Extensions to nonlinear maps

For real-valued nonlinear maps there are various possibilities to define operator norms. The least upper bound Lipschitz constant is an operator seminorm, which is all that is required. A corresponding lower Lipschitz constant can be included to define

l[f]uvf(u)f(v)L[f]uv

on a simply connected domain D. The corresponding lower and upper logarithmic Lipschitz constants are defined by

m[f]=limh0L[I+hf]1h;M[f]=limh0+L[I+hf]1h,

where both limits exist if f is Lipschitz continuous. For inner product norms, one can instead use the special definitions

m2[f]uv22uv,f(u)f(v)M2[f]uv22,

where one or both of the logarithmic Lipschitz constants may exist even when f is not Lipschitz continuous. Although it is difficult to compute actual values of Lipschitz and logarithmic Lipschitz constants in practice, they retain their theoretical importance as useful tools in nonlinear analysis.[12]

All properties listed above for linear operators are retained, provided that operations like f+g etc. are allowed on a joint domain of definition. In the special case when f is Fréchet differentiable on a convex domain D, it further holds that

L[f]=supxDf(x)andM[f]=supxDM[f(x)],

where f(x) is the Jacobian matrix of f.

An operator having either m[f]>0 or M[f]<0 is called uniformly monotone, and the uniform monotonicity theorem states that

m[f]>0L[f1]1m[f],

generalizing Properties 14 - 15 above, and quantifying the Uniform monotonicity theorem due to Browder and Minty (1963). Here, the Lipschitz constant of the inverse is defined on the image of f, while its lower logarithmic Lipschitz constant is defined on the domain of f.

Because composition is not distributive over addition for nonlinear maps, the von Neumann type bounds for 'rational functions of operators' have no direct counterparts in the nonlinear case. Nevertheless some results carry over. For example, in a fixed point iteration xk+1=g(xk) converging to a fixed point x, it can be shown that the error is bounded by

xk+1xL[g(Ig)1]xk+1xk,

where the difference between the two most recent iterates is directly computable, and where any one of the three bounds

L[g(Ig)1]L[g]1M[g]L[g]1L[g],

can be used to obtain an error estimate. In order to guarantee the simple estimate xk+1xxk+1xk using the third expression, one needs L[g]1/2 to ensure that the estimate is satisfied. If instead the central option is used, it is sufficient that L[g]<1 and M[g]0. Using the first (best) expression directly, a sharper result holds, but only for inner product norms, viz.,

M2[g]12L2[g(Ig)1]1,

without further restrictions on L2[g]. In other words, even slowly convergent iterations admit a simple error estimate, provided that the logarithmic Lipschitz constant of g is less than 1/2. The bound relies on using the polarization identity, in the linear as well as the nonlinear case, and therefore only holds for the Euclidean norm.

For more complicated "rational functions" of f there are only limited results. Consider a nonlinear system x˙=f(x) with M2[f]0, discretized by the trapezoidal rule and the implicit midpoint method, using the same time step h. Both are A-stable Runge-Kutta methods, associated with the same rational stability function R(z)=(1+z/2)/(1z/2), so if fA is a linear map, both methods are contractive (see above). In the nonlinear case, however, the trapezoidal rule yields the recursion

un+1=(Ihf2)1(I+hf2)(un)

while the implicit midpoint method is

vn+1=(I+hf2)(Ihf2)1(vn).

Because the two nonlinear operators do not commute, only the second recursion is contractive under the assumption M2[hf]0, i.e.,

M2[f]0L2[(I+hf2)(Ihf2)1]1.

Thus A-stability is not sufficient. Instead a qualified condition, known as B-stability, is required for contractivity in the Euclidean norm[13]; the results again rely on the polarization identity. Of the two methods above, only the implicit midpoint method is B-stable, but there are other, advanced B-stable Runge--Kutta methods of high order of convergence.[14][15]

For nonlinear maps boundedness and continuity are not equivalent. While the Lipschitzian approach to logarithmic norms above emphasizes continuity and stability, there is also an alternative, weaker approach, primarily linked to coercivity. Thus, for maps F satisfying F(0)=0, defined on a star-shaped domain D containing the origin, one can define an operator norm F=supxDF(x)/x, together with an upper logarithmic norm

μ^[F]=limh0+I+hF1h.

Likewise, in inner product spaces one may define lower and upper logarithmic norms by[1]

μˇ2[F]x22x,F(x)μ^2[F]x22.

Differential inequalities

In dynamical systems it is of interest to estimate how far apart two solutions drift, given various types of perturbations. This concerns bounding the difference uv, where

u˙=f(u)+r
v˙=f(v).

From the differential inequality

Dt+uvM[f]uv+r,t0

the following general bound is obtained using logarithmic Lipschitz constants,

u(t)v(t)etM[f]u0v0+0te(tτ)M[f]r(τ)dτ.

This bound applies to linear systems, non-autonomous systems, nonlinear systems as well as some problems with where f represents an unbounded operator, and for any preferred choice of norm. Taking r0 covers perturbed initial values, while taking u0=v0 covers the effect of forcing functions. In the latter case one can estimate

u(t)v(t)etM[f]1M[f]r,

where r=maxt0r(t). The bound generally holds on compact intervals, but if M[f]<0 it also holds on infinite intervals, showing that the problem is well posed, and that the error uv is uniformly bounded in terms of r for all t0, i.e.,

u(t)v(t)rM[f].

In case of a stiff differential equation u˙=f(u) on [0,T] it holds that

eTm[f]eTM[f],

where the expression on the left represents the maximal growth in reverse time over a time interval of length T, while the right expression shows the maximal growth in forward time. Hence the condition

s[f]:=m[f]+M[f]21T

is characteristic of stiffness: the average of the logarithmic Lipschitz constants is negative, and typically several orders of magnitude larger than 1/T. Here the functional s[f] is referred to as the stiffness indicator.[16]

By instead using the logarithmic norms of the Jacobian matrix f(u), the criterion can be given a local interpretation along the solution trajectory. If an explicit method is used to integrate the equation, then the time step h will be restricted by numerical stability requirements to be of order h1/|s[f(u)]|.

Note that

s[f(u)]=μ1+μn2,

where μ1 and μn are the largest and smallest logarithmic values (see above) of the Jacobian f(u). Because it is relatively expensive to compute the logarithmic norms, an alternative of similar utility is to compute the average of all logarithmic values. The scaled trace τ[] is defined by

τ[f(u)]:=1nkμk=1ntrace[f(u)]=1n(divf)(u).

In practice, τ[f(u)] serves well as a stiffness indicator, and is cheap to compute as it is simply the average of the diagonal elements of f(u). Moreover, it equals the scaled divergence of the vector field at u,

1n(divf)(u).

A dynamical system with divf0 preserves phase volume (often referred to as area preserving). What is characteristic of stiff differential equations is that divf is large and negative, implying that phase volume is compressed along the solution. The phase volume compression rate τ[f(u)] (per unit time) is indicative of stiffness at u, and quantifies step size restrictions along the trajectory for explicit methods.[17]

Elliptic differential operators

A differential operator satisfying u,u>0 is called elliptic. In particular, the logarithmic norm quantifies the concept of strong ellipticity. Applications are found in boundary value problems and partial differential equations. In the simplest case one considers piecewise continuously differentiable functions u,v satisfying homogeneous Dirichlet conditions u(0)=u(1)=0, with the inner product

u,v=01uvdx

inducing the usual L2 norm on the function space H01(0,1). Letting u denote du/dx, integration by parts yields u,v=u,v. Hence, for the operator =d2/dx2 we obtain

u,u=u,u=u,u=u22π2u22,

where the last step is the Poincaré inequality. Equality is attained for the function u=sinπx, implying that π2 is the best possible constant. Thus

u,um2[]u22π2u22,

showing that is strongly elliptic, with m2[d2/dx2]=π2>0 quantifying the ellipticity. (The upper logarithmic norm is +, since the operator is unbounded.) An analogous derivation for Neumann conditions u(0)=u(1)=0 results in m2[]=π2/4, where the free boundary condition at x=1 reduces ellipticity compared with the case of Dirichlet conditions.

Considering the 1D Poisson equation u=f with fL2(0,1) and homogeneous Dirichlet conditions, Property 14 above offers the solution bound

u2f2m2[]=f2π2.

Hence the solution u is unique and depends continuously on the data, implying stability, where the inverse of the lower logarithmic norm is the stability constant. A smaller stability constant implies that the solution is more sensitive to perturbations in the data function f.

Similar results hold for finite element approximations, where the function space is replaced by (say) piecewise linear functions defined on a grid. This affects the logarithmic norm of , which becomes somewhat larger (since the minimizer in the Poincaré inequality cannot be represented). The theory can be extended to Sobolev norms and weak solutions, e.g. allowing "point loads" (Dirac functions in the right hand side), as is common in Galerkin methods, using different norms in the solution and data and spaces as needed.

If instead a finite difference method is used to solve u=f, the differential equation is approximated by an algebraic equation Lu~=f~. The matrix L will typically inherit the ellipticity, i.e., m2[L]>0, showing that L is positive definite and therefore invertible.

The importance of strong ellipticity in selfadjoint differential operators can be illustrated by Euler buckling, where a slender column, fixed by articulated boundary conditions (homogeneous Dirichlet conditions as above), is subjected to an axial load F. The solution u represents the deflection from the column's center line, and satisfies the eigenvalue problem

uFEIu=0,

where E is the elasticity module of the material and I is the cross-section moment of inertia of the column. The logarithmic norm of the associated operator is

m2[d2dx2FEI]=π2FEI.

As long as F/(EI)<π2, the operator remains strongly elliptic, and u0 is the unique solution. However, at the critical load Fc/(EI)=π2 ellipticity is lost: the column fails (buckles) when FFc. The zero solution is then no longer unique, in agreement with the Fredholm alternative, as the problem also admits nonzero (buckled) solutions satisfying the homogeneous boundary conditions.

The logarithmic norm can also be used in initial-boundary value partial differential equations ut=𝒜u+f(u), where it quantifies the growth or decay rates of solutions. This is primarily of use in dissipative systems. The difference between two solutions decays if

M2[𝒜]+M2[f]<0,

in which case the problem is well posed. This will typically require 𝒜 to be strongly elliptic and to dominate M2[f] for sufficient dissipation, such as in a reaction-diffusion equation.

For a more complicated differential operator, such as in convection-diffusion, M2[𝒜] may depend in a nontrivial way on the type of boundary conditions and the balance of convection and diffusion rates (the Péclet number), leading to questions of well posedness. For problems in several space dimensions, it is usually difficult to calculate the logarithmic norm, as it depends on the geometry of the computational domain where the boundary conditions are imposed.

References

  1. 1.0 1.1 1.2 1.3 1.4 Söderlind, G. (2024). "Logarithmic Norms." Springer Series in Computational Mathematics vol. 63. ISBN 978-3-031-74378-8 https://doi.org/10.1007/978-3-031-74379-5
  2. Bullo, F. (2022). "Contraction Theory for Dynamical Systems". 1st ed., (19 Aug. 2022). Kindle Direct Publishing ISBN 979-8836646806
  3. Dahlquist, G. (1958). "Stability and error bounds in the numerical integration of ordinary differential equations", Almqvist & Wiksell, Uppsala 1958
  4. Lozinskiĭ, S. (1958). "Error estimates for the numerical integration of ordinary differential equations, Part I". (In Russian) Izv. Vyss. Uceb. Zaved. Matematika 6, pp 52-90 (1958)
  5. Bullo, F. (2022). "Contraction Theory for Dynamical Systems". 1st ed., (19 Aug. 2022). Kindle Direct Publishing ISBN 979-8836646806
  6. Desoer, C.; Haneda, H. (1972). "The measure of a matrix as a tool to analyze computer algorithms for circuit analysis". IEEE Transactions on Circuit Theory 19 (5): 480–486. doi:10.1109/tct.1972.1083507. 
  7. Desoer, C. A.; Vidyasagar, M. (1975). Feedback Systems: Input-output Properties. New York: Elsevier. p. 34. ISBN 9780323157797. 
  8. Söderlind, G. (1986). "Bounds on nonlinear operators in finite-dimensional Banach spaces". Numerical Mathematics 50, pp 27-44.
  9. Söderlind, G. (2006). "The logarithmic norm. History and modern theory", BIT Numerical Mathematics, 46(3):631-652.
  10. Desoer, C.; Haneda, H. (1972). "The measure of a matrix as a tool to analyze computer algorithms for circuit analysis". IEEE Transactions on Circuit Theory 19 (5): 480–486. doi:10.1109/tct.1972.1083507. 
  11. Dekker, K.; Verwer, J.G. (1984). "Stability of Runge-Kutta methods for stiff nonlinear differential equations." New York: North Holland.}
  12. Söderlind, G. (1986). "Bounds on nonlinear operators in finite-dimensional Banach spaces". Numerical Mathematics 50, pp 27-44.
  13. Butcher, J.C. (1975): "A stability property of implicit Runge--Kutta methods". BIT 15: 358-361.
  14. Butcher, J.C. (2008) "Numerical methods for ordinary differential equations", 2nd ed. New York: Wiley
  15. Hairer, E.; Wanner, G. (1991). "Solving orindary differential equations II. Stiff and differential-algebraic problems." Berlin-Heidelberg-New York: Springer.
  16. Söderlind, G.; Jay, L.O.; Calvo, M. (2015) "Stiffness 1952--2012: Sixty years in search of a definition." BIT 55: 531-558.
  17. Söderlind, G. (2024) "Logarithmic Norms." Springer Series in Computational Mathematics vol. 63. pp 281-286. ISBN 978-3-031-74378-8 https://doi.org/10.1007/978-3-031-74379-5