Definiteness of a matrix
In linear algebra, a symmetric [math]\displaystyle{ n \times n }[/math] real matrix [math]\displaystyle{ M }[/math] is said to be positive definite if the scalar [math]\displaystyle{ z^\textsf{T}Mz }[/math] is strictly positive for every non-zero column vector [math]\displaystyle{ z }[/math] of [math]\displaystyle{ n }[/math] real numbers. Here [math]\displaystyle{ z^\textsf{T} }[/math] denotes the transpose of [math]\displaystyle{ z }[/math].[1] When interpreting [math]\displaystyle{ Mz }[/math] as the output of an operator, [math]\displaystyle{ M }[/math], that is acting on an input, [math]\displaystyle{ z }[/math], the property of positive definiteness implies that the output always has a positive inner product with the input, as often observed in physical processes.
More generally, a complex [math]\displaystyle{ n\times n }[/math] Hermitian matrix [math]\displaystyle{ M }[/math] is said to be positive definite if the scalar [math]\displaystyle{ z^* Mz }[/math] is strictly positive for every non-zero column vector [math]\displaystyle{ z }[/math] of [math]\displaystyle{ n }[/math] complex numbers. Here [math]\displaystyle{ z^* }[/math] denotes the conjugate transpose of [math]\displaystyle{ z }[/math]. Note that [math]\displaystyle{ z^* Mz }[/math] is automatically real since [math]\displaystyle{ M }[/math] is Hermitian.
Positive semi-definite matrices are defined similarly, except that the above scalars [math]\displaystyle{ z^\textsf{T}Mz }[/math] or [math]\displaystyle{ z^* Mz }[/math] must be positive or zero (i.e. non-negative). Negative definite and negative semi-definite matrices are defined analogously. A matrix that is not positive semi-definite and not negative semi-definite is called indefinite.
The matrix [math]\displaystyle{ M }[/math] is positive definite if and only if the bilinear form [math]\displaystyle{ \langle z, w\rangle = z^\textsf{T} Mw }[/math] is positive definite (and similarly for a positive definite sesquilinear form in the complex case). This is a coordinate realization of an inner product on a vector space.[2]
Some authors use more general definitions of definiteness, including some non-symmetric real matrices, or non-Hermitian complex ones.
Definitions
In the following definitions, [math]\displaystyle{ x^\textsf{T} }[/math] is the transpose of [math]\displaystyle{ x }[/math], [math]\displaystyle{ x^* }[/math] is the conjugate transpose of [math]\displaystyle{ x }[/math] and [math]\displaystyle{ \mathbf{0} }[/math] denotes the n-dimensional zero-vector.
Definitions for real matrices
An [math]\displaystyle{ n \times n }[/math] symmetric real matrix [math]\displaystyle{ M }[/math] is said to be positive definite if [math]\displaystyle{ x^\textsf{T} Mx \gt 0 }[/math] for all non-zero [math]\displaystyle{ x }[/math] in [math]\displaystyle{ \mathbb{R}^n }[/math]. Formally,
[math]\displaystyle{ M \text{ positive definite} \quad \iff \quad x^\textsf{T} Mx \gt 0 \text{ for all } x \in \mathbb{R}^n \setminus \{\mathbf{0}\} }[/math]
An [math]\displaystyle{ n \times n }[/math] symmetric real matrix [math]\displaystyle{ M }[/math] is said to be positive semi-definite or non-negative definite if [math]\displaystyle{ x^\textsf{T} Mx \geq 0 }[/math] for all [math]\displaystyle{ x }[/math] in [math]\displaystyle{ \mathbb{R}^n }[/math]. Formally,
[math]\displaystyle{ M \text{ positive semi-definite} \quad \iff \quad x^\textsf{T} Mx \geq 0 \text{ for all } x \in \mathbb{R}^n }[/math]
An [math]\displaystyle{ n \times n }[/math] symmetric real matrix [math]\displaystyle{ M }[/math] is said to be negative definite if [math]\displaystyle{ x^\textsf{T} Mx \lt 0 }[/math] for all non-zero [math]\displaystyle{ x }[/math] in [math]\displaystyle{ \mathbb{R}^n }[/math]. Formally,
[math]\displaystyle{ M \text{ negative definite} \quad \iff \quad x^\textsf{T} Mx \lt 0 \text{ for all } x \in \mathbb{R}^n \setminus \{\mathbf{0}\} }[/math]
An [math]\displaystyle{ n \times n }[/math] symmetric real matrix [math]\displaystyle{ M }[/math] is said to be negative semi-definite or non-positive definite if [math]\displaystyle{ x^\textsf{T} Mx \leq 0 }[/math] for all [math]\displaystyle{ x }[/math] in [math]\displaystyle{ \mathbb{R}^n }[/math]. Formally,
[math]\displaystyle{ M \text{ negative semi-definite} \quad \iff \quad x^\textsf{T} Mx \leq 0 \text{ for all } x \in \mathbb{R}^n }[/math]
An [math]\displaystyle{ n \times n }[/math] symmetric real matrix that is neither positive semi-definite nor negative semi-definite is called indefinite.
Definitions for complex matrices
The following definitions all involve the term [math]\displaystyle{ x^* Mx }[/math]. Notice that this is always a real number for any Hermitian square matrix [math]\displaystyle{ M }[/math].
An [math]\displaystyle{ n \times n }[/math] Hermitian complex matrix [math]\displaystyle{ M }[/math] is said to be positive definite if [math]\displaystyle{ x^* Mx \gt 0 }[/math] for all non-zero [math]\displaystyle{ x }[/math] in [math]\displaystyle{ \mathbb{C}^n }[/math]. Formally,
[math]\displaystyle{ M \text{ positive definite} \quad \iff \quad x^* Mx \gt 0 \text{ for all } x \in \mathbb{C}^n \setminus \{\mathbf{0}\} }[/math]
An [math]\displaystyle{ n \times n }[/math] Hermitian complex matrix [math]\displaystyle{ M }[/math] is said to be positive semi-definite or non-negative definite if [math]\displaystyle{ x^* Mx \geq 0 }[/math] for all [math]\displaystyle{ x }[/math] in [math]\displaystyle{ \mathbb{C}^n }[/math]. Formally,
[math]\displaystyle{ M \text{ positive semi-definite} \quad \iff \quad x^* Mx \geq 0 \text{ for all } x \in \mathbb{C}^n }[/math]
An [math]\displaystyle{ n \times n }[/math] Hermitian complex matrix [math]\displaystyle{ M }[/math] is said to be negative definite if [math]\displaystyle{ x^* Mx \lt 0 }[/math] for all non-zero [math]\displaystyle{ x }[/math] in [math]\displaystyle{ \mathbb{C}^n }[/math]. Formally,
[math]\displaystyle{ M \text{ negative definite} \quad \iff \quad x^* Mx \lt 0 \text{ for all } x \in \mathbb{C}^n \setminus \{\mathbf{0}\} }[/math]
An [math]\displaystyle{ n \times n }[/math] Hermitian complex matrix [math]\displaystyle{ M }[/math] is said to be negative semi-definite or non-positive definite if [math]\displaystyle{ x^* Mx \leq 0 }[/math] for all [math]\displaystyle{ x }[/math] in [math]\displaystyle{ \mathbb{C}^n }[/math]. Formally,
[math]\displaystyle{ M \text{ negative semi-definite} \quad \iff \quad x^* Mx \leq 0 \text{ for all } x \in \mathbb{C}^n }[/math]
An [math]\displaystyle{ n \times n }[/math] Hermitian complex matrix that is neither positive semi-definite nor negative semi-definite is called indefinite.
Consistency between real and complex definitions
Since every real matrix is also a complex matrix, the definitions of "definiteness" for the two classes must agree.
For complex matrices, the most common definition says that "[math]\displaystyle{ M }[/math] is positive definite if and only if [math]\displaystyle{ z^* Mz }[/math] is real and positive for all non-zero complex column vectors [math]\displaystyle{ z }[/math]". This condition implies that [math]\displaystyle{ M }[/math] is Hermitian (i.e. its transpose is equal to its conjugate). To see this, consider the matrices [math]\displaystyle{ A = \tfrac{1}{2} \left(M + M^*\right) }[/math] and [math]\displaystyle{ B = \tfrac{1}{2i} \left(M - M^*\right) }[/math], so that [math]\displaystyle{ M = A + iB }[/math] and [math]\displaystyle{ z^* Mz = z^* Az + iz^* Bz }[/math]. The matrices [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] are Hermitian, therefore [math]\displaystyle{ z^* Az }[/math] and [math]\displaystyle{ z^* Bz }[/math] are individually real. If [math]\displaystyle{ z^* Mz }[/math] is real, then [math]\displaystyle{ z^* Bz }[/math] must be zero for all [math]\displaystyle{ z }[/math]. Then [math]\displaystyle{ B }[/math] is the zero matrix and [math]\displaystyle{ M = A }[/math], proving that [math]\displaystyle{ M }[/math] is Hermitian.
By this definition, a positive definite real matrix [math]\displaystyle{ M }[/math] is Hermitian, hence symmetric; and [math]\displaystyle{ z^\textsf{T} Mz }[/math] is positive for all non-zero real column vectors [math]\displaystyle{ z }[/math]. However the last condition alone is not sufficient for [math]\displaystyle{ M }[/math] to be positive definite. For example, if
- [math]\displaystyle{ M = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}, }[/math]
then for any real vector [math]\displaystyle{ z }[/math] with entries [math]\displaystyle{ a }[/math] and [math]\displaystyle{ b }[/math] we have [math]\displaystyle{ z^\textsf{T} Mz = (a - b)a + (a + b)b = a^2 + b^2 }[/math], which is always positive if [math]\displaystyle{ z }[/math] is not zero. However, if [math]\displaystyle{ z }[/math] is the complex vector with entries [math]\displaystyle{ 1 }[/math] and [math]\displaystyle{ i }[/math], one gets
- [math]\displaystyle{ z^* M z = [1, -i] M [1, i]^\textsf{T} = [1 + i, 1 - i] [1, i]^\textsf{T} = 2+2i }[/math]
which is not real. Therefore, [math]\displaystyle{ M }[/math] is not positive definite.
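The two evaluations above can be reproduced numerically. The following is a minimal sketch in plain Python; the helper name `quad_form` and the sample real vector (3, -2) are illustrative choices, not part of the text.

```python
def quad_form(M, z):
    """Return z^* M z for a square matrix M (list of rows) and a vector z."""
    n = len(z)
    # M z, computed row by row
    Mz = [sum(M[i][j] * z[j] for j in range(n)) for i in range(n)]
    # z^* (M z): the entries of z are conjugated on the left
    return sum(complex(z[i]).conjugate() * Mz[i] for i in range(n))

M = [[1, 1], [-1, 1]]

# Real vector with a = 3, b = -2: the value is a^2 + b^2 = 13.
real_value = quad_form(M, [3, -2])

# Complex vector with entries 1 and i: the value 2 + 2i is not real.
complex_value = quad_form(M, [1, 1j])

print(real_value, complex_value)
```

The non-zero imaginary part of the second value is what rules out positive definiteness in the complex sense.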
On the other hand, for a symmetric real matrix [math]\displaystyle{ M }[/math], the condition "[math]\displaystyle{ z^\textsf{T} Mz \gt 0 }[/math] for all nonzero real vectors [math]\displaystyle{ z }[/math]" does imply that [math]\displaystyle{ M }[/math] is positive definite in the complex sense.
Notation
If a Hermitian matrix [math]\displaystyle{ M }[/math] is positive semi-definite, one sometimes writes [math]\displaystyle{ M \succeq 0 }[/math] and if [math]\displaystyle{ M }[/math] is positive definite one writes [math]\displaystyle{ M \succ 0 }[/math]. To denote that [math]\displaystyle{ M }[/math] is negative semi-definite one writes [math]\displaystyle{ M \preceq 0 }[/math] and to denote that [math]\displaystyle{ M }[/math] is negative definite one writes [math]\displaystyle{ M \prec 0 }[/math].
The notion comes from functional analysis where positive semidefinite matrices define positive operators.
A common alternative notation is [math]\displaystyle{ M \geq 0 }[/math], [math]\displaystyle{ M \gt 0 }[/math], [math]\displaystyle{ M \leq 0 }[/math] and [math]\displaystyle{ M \lt 0 }[/math] for positive semi-definite and positive definite, negative semi-definite and negative definite matrices, respectively. This may be confusing, since the same symbols are sometimes used for entrywise nonnegative and nonpositive matrices.
Examples
- The identity matrix [math]\displaystyle{ I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} }[/math] is positive definite (and as such also positive semi-definite). It is a real symmetric matrix, and, for any non-zero column vector z with real entries a and b, one has
- [math]\displaystyle{ z^\textsf{T}Iz = \begin{bmatrix} a & b \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = a^2 + b^2 }[/math].
Seen as a complex matrix, for any non-zero column vector z with complex entries a and b one has
- [math]\displaystyle{ z^*Iz = \begin{bmatrix} \overline{a} & \overline{b} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a \\ b\end{bmatrix} = \overline{a}a + \overline{b}b = |a|^2 + |b|^2 }[/math].
- The real symmetric matrix
- [math]\displaystyle{ M = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix} }[/math]
is positive definite, since for any non-zero column vector z with real entries a, b and c one has
- [math]\displaystyle{ \begin{align} z^\textsf{T} Mz = \left(z^\textsf{T}M\right) z &= \begin{bmatrix} (2a - b) & (-a + 2b - c) & (-b + 2c) \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix} \\ &= (2a - b)a + (-a + 2b - c)b + (-b + 2c)c \\ &= 2a^2 - ba - ab + 2b^2 - cb - bc + 2c^2 \\ &= 2a^2 - 2ab + 2b^2 - 2bc + 2c^2 \\ &= a^2 + a^2 - 2ab + b^2 + b^2- 2bc + c^2 + c^2 \\ &= a^2 + (a - b)^2 + (b - c)^2 + c^2 \end{align} }[/math]
This is a sum of squares and therefore non-negative; it is zero only if [math]\displaystyle{ a = b = c = 0 }[/math], that is, only when z is the zero vector.
- For any real invertible matrix [math]\displaystyle{ A }[/math], the product [math]\displaystyle{ A^\textsf{T}A }[/math] is a positive definite matrix. A simple proof is that for any non-zero vector [math]\displaystyle{ z }[/math], [math]\displaystyle{ z^\textsf{T} A^\textsf{T} Az = (Az)^\textsf{T}(Az) = \|Az\|^2 \gt 0, }[/math] since the invertibility of the matrix [math]\displaystyle{ A }[/math] means that [math]\displaystyle{ Az \neq 0. }[/math]
- The example [math]\displaystyle{ M }[/math] above shows that a matrix in which some elements are negative may still be positive definite. Conversely, a matrix whose entries are all positive is not necessarily positive definite, as for example
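- [math]\displaystyle{ N = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}, }[/math]
for which [math]\displaystyle{ \begin{bmatrix} -1 & 1 \end{bmatrix} N \begin{bmatrix} -1 & 1 \end{bmatrix}^\textsf{T} = -2 \lt 0 }[/math].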
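As a quick numerical check of the examples in this list, the sketch below evaluates the quadratic forms directly in plain Python. The helper `quad_form`, the sample vectors, and the test vector (-1, 1) for N are illustrative assumptions.

```python
def quad_form(M, z):
    """Return z^T M z for a real square matrix M (list of rows) and a real vector z."""
    n = len(z)
    return sum(z[i] * M[i][j] * z[j] for i in range(n) for j in range(n))

M = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
N = [[1, 2], [2, 1]]

# Compare z^T M z with the sum of squares a^2 + (a - b)^2 + (b - c)^2 + c^2.
for a, b, c in [(1, 0, 0), (1, 1, 1), (-2, 3, 5)]:
    lhs = quad_form(M, [a, b, c])
    rhs = a**2 + (a - b)**2 + (b - c)**2 + c**2
    print((a, b, c), lhs, rhs)

# A vector on which N, whose entries are all positive, gives a negative value.
print(quad_form(N, [-1, 1]))
```

Agreement on a few sample vectors is of course not a proof; the algebraic identity above is what establishes positive definiteness of M.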
Eigenvalues
Let [math]\displaystyle{ M }[/math] be an [math]\displaystyle{ n \times n }[/math] Hermitian matrix.
- [math]\displaystyle{ M }[/math] is positive definite if and only if all of its eigenvalues are positive.
- [math]\displaystyle{ M }[/math] is positive semi-definite if and only if all of its eigenvalues are non-negative.
- [math]\displaystyle{ M }[/math] is negative definite if and only if all of its eigenvalues are negative.
- [math]\displaystyle{ M }[/math] is negative semi-definite if and only if all of its eigenvalues are non-positive.
- [math]\displaystyle{ M }[/math] is indefinite if and only if it has both positive and negative eigenvalues.
Let [math]\displaystyle{ P^{-1} DP }[/math] be an eigendecomposition of [math]\displaystyle{ M }[/math], where [math]\displaystyle{ P }[/math] is a unitary complex matrix whose rows comprise an orthonormal basis of eigenvectors of [math]\displaystyle{ M }[/math], and [math]\displaystyle{ D }[/math] is a real diagonal matrix whose main diagonal contains the corresponding eigenvalues. The matrix [math]\displaystyle{ M }[/math] may be regarded as a diagonal matrix [math]\displaystyle{ D }[/math] that has been re-expressed in coordinates of the basis [math]\displaystyle{ P }[/math]. In particular, the one-to-one change of variable [math]\displaystyle{ y = Pz }[/math] shows that [math]\displaystyle{ z^* Mz }[/math] is real and positive for every non-zero complex vector [math]\displaystyle{ z }[/math] if and only if [math]\displaystyle{ y^* Dy }[/math] is real and positive for every non-zero [math]\displaystyle{ y }[/math]; in other words, if and only if [math]\displaystyle{ D }[/math] is positive definite. For a diagonal matrix, this holds if and only if each element of the main diagonal (that is, every eigenvalue of [math]\displaystyle{ M }[/math]) is positive. Since the spectral theorem guarantees that all eigenvalues of a Hermitian matrix are real, the positivity of the eigenvalues can be checked using Descartes' rule of signs when the characteristic polynomial of a real, symmetric matrix [math]\displaystyle{ M }[/math] is available.
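The eigenvalue criteria above translate directly into a classification routine. The sketch below is restricted to real symmetric 2 × 2 matrices, where the eigenvalues have a closed form; the function names and the tolerance `tol` are illustrative assumptions, and larger matrices would need a numerical eigenvalue solver instead.

```python
import math

def eigenvalues_sym2(M):
    """Eigenvalues (smaller, larger) of a real symmetric 2x2 matrix [[a, b], [b, d]]."""
    a, b, d = M[0][0], M[0][1], M[1][1]
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    return mean - radius, mean + radius

def classify(M, tol=1e-12):
    """Classify a real symmetric 2x2 matrix by the signs of its eigenvalues."""
    lo, hi = eigenvalues_sym2(M)
    if lo > tol:
        return "positive definite"
    if hi < -tol:
        return "negative definite"
    if lo >= -tol:
        # Smallest eigenvalue is (numerically) zero. The zero matrix lands here
        # too, although it is also negative semi-definite.
        return "positive semi-definite"
    if hi <= tol:
        return "negative semi-definite"
    return "indefinite"

for M in ([[1, 0], [0, 1]], [[1, 2], [2, 1]], [[1, 1], [1, 1]], [[-2, 0], [0, -3]]):
    print(M, classify(M))
```

The tolerance guards against rounding when an eigenvalue is zero in exact arithmetic; its value here is arbitrary.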
Connections
A general purely quadratic real function [math]\displaystyle{ f(\mathbf{x}) }[/math] on [math]\displaystyle{ n }[/math] real variables [math]\displaystyle{ x_1, \ldots, x_n }[/math] can always be written as [math]\displaystyle{ \mathbf{x}^\textsf{T} M \mathbf{x} }[/math] where [math]\displaystyle{ \mathbf{x} }[/math] is the column vector with those variables, and [math]\displaystyle{ M }[/math] is a symmetric real matrix. Therefore, the matrix being positive definite means that [math]\displaystyle{ f }[/math] has a unique minimum (zero) when [math]\displaystyle{ \mathbf{x} }[/math] is zero, and is strictly positive for any other [math]\displaystyle{ \mathbf{x} }[/math].
More generally, a twice-differentiable real function [math]\displaystyle{ f }[/math] on [math]\displaystyle{ n }[/math] real variables has a strict local minimum at arguments [math]\displaystyle{ x_1, \ldots, x_n }[/math] if its gradient is zero and its Hessian (the matrix of all second derivatives) is positive definite at that point. A positive semi-definite Hessian at a point of zero gradient is necessary for a local minimum, but not sufficient. Similar statements can be made for negative definite and semi-definite matrices.
In statistics, the covariance matrix of a multivariate probability distribution is always positive semi-definite; and it is positive definite unless one variable is an exact linear function of the others. Conversely, every positive semi-definite matrix is the covariance matrix of some multivariate distribution.
Characterizations
Let [math]\displaystyle{ M }[/math] be an [math]\displaystyle{ n \times n }[/math] Hermitian matrix. The following properties are equivalent to [math]\displaystyle{ M }[/math] being positive definite:
- The associated sesquilinear form is an inner product
- The sesquilinear form defined by [math]\displaystyle{ M }[/math] is the function [math]\displaystyle{ \langle \cdot, \cdot\rangle }[/math] from [math]\displaystyle{ \mathbb{C}^n \times \mathbb{C}^n }[/math] to [math]\displaystyle{ \mathbb{C} }[/math] such that [math]\displaystyle{ \langle x, y \rangle := y^*M x }[/math] for all [math]\displaystyle{ x }[/math] and [math]\displaystyle{ y }[/math] in [math]\displaystyle{ \mathbb{C}^n }[/math], where [math]\displaystyle{ y^* }[/math] is the conjugate transpose of [math]\displaystyle{ y }[/math]. For any complex matrix [math]\displaystyle{ M }[/math], this form is linear in [math]\displaystyle{ x }[/math] and semilinear in [math]\displaystyle{ y }[/math]. Therefore, the form is an inner product on [math]\displaystyle{ \mathbb{C}^n }[/math] if and only if [math]\displaystyle{ \langle z, z \rangle }[/math] is real and positive for all nonzero [math]\displaystyle{ z }[/math]; that is, if and only if [math]\displaystyle{ M }[/math] is positive definite. (In fact, every inner product on [math]\displaystyle{ \mathbb{C}^n }[/math] arises in this fashion from a Hermitian positive definite matrix.)
- It is the Gram matrix of a set of linearly independent vectors
- Let [math]\displaystyle{ x_1, \ldots, x_n }[/math] be a list of [math]\displaystyle{ n }[/math] linearly independent vectors of some complex vector space with an inner product [math]\displaystyle{ \langle \cdot, \cdot \rangle }[/math]. It can be verified that the Gram matrix [math]\displaystyle{ M }[/math] of those vectors, defined by [math]\displaystyle{ M_{ij} = \langle x_i, x_j \rangle }[/math], is always positive definite. Conversely, if [math]\displaystyle{ M }[/math] is positive definite, it has an eigendecomposition [math]\displaystyle{ P^{-1} DP }[/math] where [math]\displaystyle{ P }[/math] is unitary, [math]\displaystyle{ D }[/math] diagonal, and all diagonal elements [math]\displaystyle{ D_{ii} = \lambda_i }[/math] of [math]\displaystyle{ D }[/math] are real and positive. Let [math]\displaystyle{ E }[/math] be the real diagonal matrix with entries [math]\displaystyle{ E_{ii} = \sqrt{\lambda_i} }[/math] so [math]\displaystyle{ E^2 = D }[/math]; then [math]\displaystyle{ P^{-1}DP = P^*DP = P^*E EP = (EP)^* EP }[/math]. Now we let [math]\displaystyle{ x_1, \ldots, x_n }[/math] be the columns of [math]\displaystyle{ EP }[/math]. These vectors are linearly independent, and by the above [math]\displaystyle{ M }[/math] is their Gram matrix, under the standard inner product of [math]\displaystyle{ \mathbb{C}^n }[/math], namely [math]\displaystyle{ \langle x_i, x_j \rangle = x_i^\ast x_j }[/math].
- Its leading principal minors are all positive
- The kth leading principal minor of a matrix [math]\displaystyle{ M }[/math] is the determinant of its upper-left [math]\displaystyle{ k \times k }[/math] sub-matrix. It turns out that a Hermitian matrix is positive definite if and only if all these determinants are positive. This condition is known as Sylvester's criterion, and provides an efficient test of positive definiteness of a symmetric real matrix. Namely, the matrix is reduced to an upper triangular matrix by using elementary row operations, as in the first part of the Gaussian elimination method, taking care to preserve the sign of its determinant during the pivoting process. Since the kth leading principal minor of a triangular matrix is the product of its diagonal elements up to row [math]\displaystyle{ k }[/math], Sylvester's criterion is equivalent to checking whether its diagonal elements are all positive. This condition can be checked each time a new row [math]\displaystyle{ k }[/math] of the triangular matrix is obtained.
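The elimination-based test described for Sylvester's criterion can be sketched as follows. This is a minimal illustration in plain Python that assumes a real symmetric input and performs no row exchanges (none are needed as long as each pivot is positive); the function name and the tolerance `tol` are assumptions made for the example.

```python
def is_positive_definite(M, tol=1e-12):
    """Test a real symmetric matrix (list of rows) for positive definiteness.

    Gaussian elimination without row exchanges: while all earlier pivots are
    positive, the k-th pivot is the ratio of the k-th and (k-1)-th leading
    principal minors, so the test can stop at the first non-positive pivot.
    """
    n = len(M)
    U = [[float(x) for x in row] for row in M]  # working copy, M is not modified
    for k in range(n):
        pivot = U[k][k]
        if pivot <= tol:
            return False  # the k-th leading principal minor is not positive
        for i in range(k + 1, n):
            factor = U[i][k] / pivot
            for j in range(k, n):
                U[i][j] -= factor * U[k][j]
    return True

print(is_positive_definite([[2, -1, 0], [-1, 2, -1], [0, -1, 2]]))  # True
print(is_positive_definite([[1, 2], [2, 1]]))                       # False
```

The check happens row by row, so a matrix that fails early is rejected without completing the elimination.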
Quadratic forms, convexity, optimization
The (purely) quadratic form associated with a real [math]\displaystyle{ n \times n }[/math] matrix [math]\displaystyle{ M }[/math] is the function [math]\displaystyle{ Q : \mathbb{R}^n \to \mathbb{R} }[/math] such that [math]\displaystyle{ Q(x) = x^\textsf{T} Mx }[/math] for all [math]\displaystyle{ x }[/math]. [math]\displaystyle{ M }[/math] can be assumed symmetric by replacing it with [math]\displaystyle{ \tfrac{1}{2} \left(M + M^\textsf{T}\right) }[/math].
A symmetric matrix [math]\displaystyle{ M }[/math] is positive definite if and only if its quadratic form is a strictly convex function.
More generally, any quadratic function from [math]\displaystyle{ \mathbb{R}^n }[/math] to [math]\displaystyle{ \mathbb{R} }[/math] can be written as [math]\displaystyle{ x^\textsf{T} Mx + x^\textsf{T} b + c }[/math] where [math]\displaystyle{ M }[/math] is a symmetric [math]\displaystyle{ n \times n }[/math] matrix, [math]\displaystyle{ b }[/math] is a real [math]\displaystyle{ n }[/math]-vector, and [math]\displaystyle{ c }[/math] a real constant. This quadratic function is strictly convex, and hence has a unique finite global minimum, if and only if [math]\displaystyle{ M }[/math] is positive definite. For this reason, positive definite matrices play an important role in optimization problems.
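To illustrate the unique minimum, the sketch below finds the minimizer of such a quadratic function in two variables. Setting the gradient [math]\displaystyle{ 2Mx + b }[/math] to zero gives [math]\displaystyle{ x = -\tfrac{1}{2} M^{-1} b }[/math]; the function names and the sample data are illustrative assumptions.

```python
def minimize_quadratic_2d(M, b):
    """Minimizer of x^T M x + x^T b + c for a symmetric positive definite 2x2 M.

    The gradient 2 M x + b vanishes at x = -(1/2) M^{-1} b.
    """
    (p, q), (_, r) = M
    det = p * r - q * q
    if not (p > 0 and det > 0):  # Sylvester's criterion for a 2x2 matrix
        raise ValueError("M must be symmetric positive definite")
    # Solve M x = -b / 2 with the explicit 2x2 inverse.
    rhs0, rhs1 = -b[0] / 2, -b[1] / 2
    return [(r * rhs0 - q * rhs1) / det, (p * rhs1 - q * rhs0) / det]

def f(M, b, c, x):
    """Evaluate the quadratic function x^T M x + x^T b + c."""
    quad = sum(x[i] * M[i][j] * x[j] for i in range(2) for j in range(2))
    return quad + sum(x[i] * b[i] for i in range(2)) + c

M = [[2, 0], [0, 1]]
b = [-4, 2]
x_min = minimize_quadratic_2d(M, b)   # the point (1, -1)
print(x_min, f(M, b, 0, x_min))
```

Evaluating `f` at nearby points gives strictly larger values, as strict convexity requires.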
Simultaneous diagonalization
A symmetric matrix and another symmetric and positive definite matrix can be simultaneously diagonalized, although not necessarily via a similarity transformation. This result does not extend to the case of three or more matrices. This section treats the real case; the extension to the complex case is immediate.
Let [math]\displaystyle{ M }[/math] be a symmetric and [math]\displaystyle{ N }[/math] a symmetric and positive definite matrix. Write the generalized eigenvalue equation as [math]\displaystyle{ (M - \lambda N)x = 0 }[/math] where we impose that [math]\displaystyle{ x }[/math] be normalized, i.e. [math]\displaystyle{ x^\textsf{T} Nx = 1 }[/math]. Now we use Cholesky decomposition to write the inverse of [math]\displaystyle{ N }[/math] as [math]\displaystyle{ Q^\textsf{T} Q }[/math]. Multiplying by [math]\displaystyle{ Q }[/math] and letting [math]\displaystyle{ x = Q^\textsf{T} y }[/math], we get [math]\displaystyle{ Q(M - \lambda N)Q^\textsf{T} y = 0 }[/math], which can be rewritten as [math]\displaystyle{ \left(QMQ^\textsf{T}\right)y = \lambda y }[/math] where [math]\displaystyle{ y^\textsf{T} y = 1 }[/math]. Manipulation now yields [math]\displaystyle{ MX = NX\Lambda }[/math] where [math]\displaystyle{ X }[/math] is a matrix having as columns the generalized eigenvectors and [math]\displaystyle{ \Lambda }[/math] is a diagonal matrix of the generalized eigenvalues. Now premultiplication with [math]\displaystyle{ X^\textsf{T} }[/math] gives the final result: [math]\displaystyle{ X^\textsf{T} MX = \Lambda }[/math] and [math]\displaystyle{ X^\textsf{T} NX = I }[/math], but note that this is no longer an orthogonal diagonalization with respect to the inner product where [math]\displaystyle{ y^\textsf{T} y = 1 }[/math]. In fact, we diagonalized [math]\displaystyle{ M }[/math] with respect to the inner product induced by [math]\displaystyle{ N }[/math].
Note that this result does not contradict what is said on simultaneous diagonalization in the article Diagonalizable matrix, which refers to simultaneous diagonalization by a similarity transformation. Our result here is more akin to a simultaneous diagonalization of two quadratic forms, and is useful for optimization of one form under conditions on the other.[3]
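The construction in this section can be sketched for 2 × 2 matrices in plain Python. The sketch assumes [math]\displaystyle{ N = LL^\textsf{T} }[/math] with [math]\displaystyle{ L }[/math] lower triangular and takes [math]\displaystyle{ Q = L^{-1} }[/math], so that [math]\displaystyle{ Q^\textsf{T} Q = N^{-1} }[/math]. The eigenvectors of the symmetric 2 × 2 matrix [math]\displaystyle{ QMQ^\textsf{T} }[/math] are obtained from a single rotation angle. All function names and the sample matrices are illustrative.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def simultaneous_diagonalize_2x2(M, N):
    """Return X with X^T M X diagonal and X^T N X = I.

    M must be real symmetric, N real symmetric positive definite (both 2x2).
    """
    # Cholesky factor of N: N = L L^T with L lower triangular.
    l11 = math.sqrt(N[0][0])
    l21 = N[1][0] / l11
    l22 = math.sqrt(N[1][1] - l21 * l21)
    # Q = L^{-1}, so that Q^T Q = N^{-1} and Q N Q^T = I.
    Q = [[1 / l11, 0.0], [-l21 / (l11 * l22), 1 / l22]]
    # Ordinary symmetric eigenproblem for C = Q M Q^T.
    C = matmul(matmul(Q, M), transpose(Q))
    theta = 0.5 * math.atan2(2 * C[0][1], C[0][0] - C[1][1])
    # The columns of Y are orthonormal eigenvectors of C.
    Y = [[math.cos(theta), -math.sin(theta)],
         [math.sin(theta), math.cos(theta)]]
    # Generalized eigenvectors x = Q^T y, collected as columns of X.
    return matmul(transpose(Q), Y)

M = [[2, 1], [1, 0]]   # symmetric, indefinite
N = [[2, 1], [1, 2]]   # symmetric, positive definite
X = simultaneous_diagonalize_2x2(M, N)
XtMX = matmul(matmul(transpose(X), M), X)  # diagonal up to rounding
XtNX = matmul(matmul(transpose(X), N), X)  # identity up to rounding
print(XtMX)
print(XtNX)
```

Note that `X` is in general not orthogonal, which matches the remark that this is not a diagonalization by a similarity transformation.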
Properties
Induced partial ordering
For arbitrary square matrices [math]\displaystyle{ M }[/math], [math]\displaystyle{ N }[/math] we write [math]\displaystyle{ M \ge N }[/math] if [math]\displaystyle{ M - N \ge 0 }[/math] i.e., [math]\displaystyle{ M - N }[/math] is positive semi-definite. This defines a partial ordering on the set of all square matrices. One can similarly define a strict partial ordering [math]\displaystyle{ M \gt N }[/math]. The ordering is called the Loewner order.
Inverse of positive definite matrix
Every positive definite matrix is invertible and its inverse is also positive definite.[4] If [math]\displaystyle{ M \geq N \gt 0 }[/math] then [math]\displaystyle{ N^{-1} \geq M^{-1} \gt 0 }[/math].[5] Moreover, by the min-max theorem, the kth largest eigenvalue of [math]\displaystyle{ M }[/math] is greater than or equal to the kth largest eigenvalue of [math]\displaystyle{ N }[/math].
Scaling
If [math]\displaystyle{ M }[/math] is positive definite and [math]\displaystyle{ r \gt 0 }[/math] is a real number, then [math]\displaystyle{ rM }[/math] is positive definite.[6]
Addition
If [math]\displaystyle{ M }[/math] and [math]\displaystyle{ N }[/math] are positive definite, then the sum [math]\displaystyle{ M + N }[/math] is also positive definite.[6]
Multiplication
- If [math]\displaystyle{ M }[/math] and [math]\displaystyle{ N }[/math] are positive definite, then the products [math]\displaystyle{ MNM }[/math] and [math]\displaystyle{ NMN }[/math] are also positive definite. If [math]\displaystyle{ MN = NM }[/math], then [math]\displaystyle{ MN }[/math] is also positive definite.
- If [math]\displaystyle{ M }[/math] is positive semidefinite, then [math]\displaystyle{ Q^\textsf{T} MQ }[/math] is positive semidefinite. If [math]\displaystyle{ M }[/math] is positive definite and [math]\displaystyle{ Q }[/math] has full column rank, then [math]\displaystyle{ Q^\textsf{T} MQ }[/math] is positive definite.[7]
Cholesky decomposition
For any matrix [math]\displaystyle{ A }[/math], the matrix [math]\displaystyle{ A^* A }[/math] is positive semidefinite, and [math]\displaystyle{ \operatorname{rank}(A) = \operatorname{rank}(A^* A) }[/math]. Conversely, any Hermitian positive semi-definite matrix [math]\displaystyle{ M }[/math] can be written as [math]\displaystyle{ M = LL^* }[/math], where [math]\displaystyle{ L }[/math] is lower triangular; this is the Cholesky decomposition. If [math]\displaystyle{ M }[/math] is not positive definite, then some of the diagonal elements of [math]\displaystyle{ L }[/math] may be zero.
A Hermitian matrix [math]\displaystyle{ M }[/math] is positive definite if and only if it has a unique Cholesky decomposition, i.e. the matrix [math]\displaystyle{ M }[/math] is positive definite if and only if there exists a unique lower triangular matrix [math]\displaystyle{ L }[/math], with real and strictly positive diagonal elements, such that [math]\displaystyle{ M = LL^* }[/math].
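Because the factorization with strictly positive diagonal exists only for positive definite matrices, attempting it doubles as a definiteness test. The following is a minimal sketch for real symmetric matrices in plain Python; the function name and the choice to raise `ValueError` are assumptions of the example.

```python
import math

def cholesky(M):
    """Return lower triangular L with M = L L^T for a real symmetric positive
    definite M (list of rows); raise ValueError if a pivot is not positive."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Inner product of the already computed parts of rows i and j of L.
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = M[i][i] - s
                if d <= 0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (M[i][j] - s) / L[j][j]
    return L

print(cholesky([[4, 2], [2, 5]]))  # [[2.0, 0.0], [1.0, 2.0]]

try:
    cholesky([[1, 2], [2, 1]])
except ValueError as err:
    print("rejected:", err)
```

The sketch does not handle the semi-definite case, where zero diagonal entries of [math]\displaystyle{ L }[/math] would have to be allowed.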
Square root
A matrix [math]\displaystyle{ M }[/math] is positive semi-definite if and only if there is a positive semi-definite matrix [math]\displaystyle{ B }[/math] with [math]\displaystyle{ B^2 = M }[/math]. This matrix [math]\displaystyle{ B }[/math] is unique,[8] is called the square root of [math]\displaystyle{ M }[/math], and is denoted with [math]\displaystyle{ B = M^\frac{1}{2} }[/math] (the square root [math]\displaystyle{ B }[/math] is not to be confused with the matrix [math]\displaystyle{ L }[/math] in the Cholesky factorization [math]\displaystyle{ M = LL^* }[/math], which is also sometimes called the square root of [math]\displaystyle{ M }[/math]).
If [math]\displaystyle{ M \gt N \gt 0 }[/math] then [math]\displaystyle{ M^\frac{1}{2} \gt N^\frac{1}{2} \gt 0 }[/math].
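For 2 × 2 matrices the positive semi-definite square root has a closed form, which makes the difference from the Cholesky factor easy to see. The formula used below, [math]\displaystyle{ B = (M + sI)/t }[/math] with [math]\displaystyle{ s = \sqrt{\det M} }[/math] and [math]\displaystyle{ t = \sqrt{\operatorname{tr} M + 2s} }[/math], is not taken from the text above; it follows from the Cayley–Hamilton theorem and is offered here only as an illustrative sketch for nonzero real symmetric positive semi-definite input.

```python
import math

def sqrt_psd_2x2(M):
    """Positive semi-definite square root of a nonzero real symmetric PSD 2x2 matrix.

    Uses B = (M + s I) / t with s = sqrt(det M) and t = sqrt(tr M + 2 s).
    """
    (a, b), (_, d) = M
    s = math.sqrt(max(a * d - b * b, 0.0))  # clamp tiny negative rounding errors
    t = math.sqrt(a + d + 2 * s)
    return [[(a + s) / t, b / t], [b / t, (d + s) / t]]

M = [[5, 4], [4, 5]]            # eigenvalues 1 and 9
B = sqrt_psd_2x2(M)             # [[2.0, 1.0], [1.0, 2.0]], eigenvalues 1 and 3
B_squared = [[sum(B[i][k] * B[k][j] for k in range(2)) for j in range(2)]
             for i in range(2)]
print(B, B_squared)
```

Unlike the triangular Cholesky factor of the same matrix, `B` is symmetric.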
Submatrices
Every principal submatrix of a positive definite matrix is positive definite.
Trace
The diagonal entries [math]\displaystyle{ m_{ii} }[/math] of a positive semi-definite matrix are real and non-negative (and strictly positive if the matrix is positive definite). As a consequence the trace satisfies [math]\displaystyle{ \operatorname{tr}(M) \ge 0 }[/math]. Furthermore,[9] since every principal sub-matrix (in particular, 2-by-2) is positive semi-definite,
- [math]\displaystyle{ \left|m_{ij}\right| \leq \sqrt{m_{ii}m_{jj}} \leq \frac{m_{ii} + m_{jj}}{2} \quad \forall i, j }[/math]
and thus
- [math]\displaystyle{ \max_{i,j} \left|m_{ij}\right| \leq \max_i\left|m_{ii}\right| }[/math]
Hadamard product
If [math]\displaystyle{ M, N \geq 0 }[/math], although [math]\displaystyle{ MN }[/math] is not necessarily positive semidefinite, the Hadamard product [math]\displaystyle{ M \circ N \geq 0 }[/math] (this result is often called the Schur product theorem).[10]
Regarding the Hadamard product of two positive semidefinite matrices [math]\displaystyle{ M = (m_{ij}) \geq 0 }[/math], [math]\displaystyle{ N \geq 0 }[/math], there are two notable inequalities:
- Oppenheim's inequality: [math]\displaystyle{ \det(M \circ N) \geq \det (N) \prod\nolimits_i m_{ii}. }[/math][11]
- [math]\displaystyle{ \det(M \circ N) \geq \det(M) \det(N) }[/math].[12]
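A small numerical instance may help to see these statements at work. The two positive definite matrices below are chosen for illustration only; the sketch shows that their ordinary product fails to be positive semidefinite while the Hadamard product satisfies both determinant inequalities.

```python
def hadamard(M, N):
    """Entrywise (Hadamard) product of two matrices of the same shape."""
    return [[M[i][j] * N[i][j] for j in range(len(M[0]))] for i in range(len(M))]

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

M = [[1, 2], [2, 5]]      # positive definite: 1 > 0 and det = 1 > 0
N = [[1, -1], [-1, 2]]    # positive definite: 1 > 0 and det = 1 > 0

MN = matmul(M, N)         # [[-1, 3], [-3, 8]]: not symmetric, negative diagonal entry
H = hadamard(M, N)        # [[1, -2], [-2, 10]]: positive definite, det = 6

print(MN, H)
print(det2(H), det2(N) * M[0][0] * M[1][1], det2(M) * det2(N))  # 6 5 1
```

Here the lower bound from Oppenheim's inequality (5) is sharper than the product of determinants (1).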
Kronecker product
If [math]\displaystyle{ M, N \geq 0 }[/math], although [math]\displaystyle{ MN }[/math] is not necessarily positive semidefinite, the Kronecker product [math]\displaystyle{ M \otimes N \geq 0 }[/math].
Frobenius product
If [math]\displaystyle{ M, N \geq 0 }[/math], although [math]\displaystyle{ MN }[/math] is not necessarily positive semidefinite, the Frobenius product [math]\displaystyle{ M : N \geq 0 }[/math] (Lancaster–Tismenetsky, The Theory of Matrices, p. 218).
Convexity
The set of positive semidefinite symmetric matrices is convex. That is, if [math]\displaystyle{ M }[/math] and [math]\displaystyle{ N }[/math] are positive semidefinite, then for any [math]\displaystyle{ \alpha }[/math] between 0 and 1, [math]\displaystyle{ \alpha M + (1 - \alpha) N }[/math] is also positive semidefinite. For any vector [math]\displaystyle{ x }[/math]:
- [math]\displaystyle{ x^\textsf{T} \left(\alpha M + (1 - \alpha)N\right)x = \alpha x^\textsf{T} Mx + (1 - \alpha) x^\textsf{T} Nx \geq 0. }[/math]
This property makes semidefinite programming a convex optimization problem, so any locally optimal solution is also globally optimal.
Relation with cosine
The positive-definiteness of a matrix [math]\displaystyle{ A }[/math] expresses that the angle [math]\displaystyle{ \theta }[/math] between any vector [math]\displaystyle{ x }[/math] and its image [math]\displaystyle{ Ax }[/math] is always [math]\displaystyle{ -\pi / 2 \lt \theta \lt +\pi / 2 }[/math]:
[math]\displaystyle{ \cos(\theta) = \frac{x^\textsf{T} Ax}{\|x\| \, \|Ax\|} = \frac{\langle x, Ax \rangle}{\|x\| \, \|Ax\|}, }[/math] where [math]\displaystyle{ \theta = \theta(x, Ax) = \widehat{x, Ax} }[/math] is the angle between [math]\displaystyle{ x }[/math] and [math]\displaystyle{ Ax }[/math].
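The cosine formula can be evaluated directly. The sketch below, in plain Python with illustrative names and sample matrices, computes the cosine for a positive definite matrix, where it is positive for every non-zero vector tried, and for an indefinite matrix, where a negative value appears.

```python
import math

def cos_angle(A, x):
    """Cosine of the angle between a non-zero real vector x and its image A x."""
    n = len(x)
    Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    dot = sum(x[i] * Ax[i] for i in range(n))
    norm_x = math.sqrt(sum(v * v for v in x))
    norm_Ax = math.sqrt(sum(v * v for v in Ax))
    return dot / (norm_x * norm_Ax)

A = [[2, -1], [-1, 2]]   # positive definite
N = [[1, 2], [2, 1]]     # indefinite

for x in ([1, 0], [1, 1], [-1, 1], [3, -2]):
    print(x, cos_angle(A, x), cos_angle(N, x))
```

The function assumes that [math]\displaystyle{ Ax }[/math] is non-zero, which always holds for a positive definite matrix and non-zero [math]\displaystyle{ x }[/math].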
Further properties
- If [math]\displaystyle{ M }[/math] is a symmetric Toeplitz matrix, i.e. the entries [math]\displaystyle{ m_{ij} }[/math] are given as a function of their absolute index differences: [math]\displaystyle{ m_{ij} = h(|i-j|) }[/math], and the strict inequality
[math]\displaystyle{ \sum\nolimits_{j \neq 0} \left|h(j)\right| \lt h(0) }[/math]
holds, then [math]\displaystyle{ M }[/math] is strictly positive definite.
- Let [math]\displaystyle{ M \gt 0 }[/math] and [math]\displaystyle{ N }[/math] Hermitian. If [math]\displaystyle{ MN + NM \ge 0 }[/math] (resp., [math]\displaystyle{ MN + NM \gt 0 }[/math]) then [math]\displaystyle{ N \ge 0 }[/math] (resp., [math]\displaystyle{ N \gt 0 }[/math]).[13]
- If [math]\displaystyle{ M \gt 0 }[/math] is real, then there is a [math]\displaystyle{ \delta \gt 0 }[/math] such that [math]\displaystyle{ M\gt \delta I }[/math], where [math]\displaystyle{ I }[/math] is the identity matrix.
- If [math]\displaystyle{ M_k }[/math] denotes the leading [math]\displaystyle{ k \times k }[/math] principal submatrix, [math]\displaystyle{ \det\left(M_k\right)/\det\left(M_{k-1}\right) }[/math] is the kth pivot during LU decomposition.
- A matrix is negative definite if its k-th order leading principal minor is negative when [math]\displaystyle{ k }[/math] is odd, and positive when [math]\displaystyle{ k }[/math] is even.
- A matrix [math]\displaystyle{ M }[/math] is positive semidefinite if and only if it arises as the Gram matrix of some set of vectors. In contrast to the positive definite case, these vectors need not be linearly independent.
A Hermitian matrix is positive semidefinite if and only if all of its principal minors are nonnegative. It is however not enough to consider the leading principal minors only, as the diagonal matrix with entries 0 and −1 shows: both of its leading principal minors are zero, yet it is not positive semidefinite.
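The difference between leading principal minors and all principal minors can be made concrete. The sketch below, in plain Python, assumes a small real symmetric matrix stored as a list of rows; the helper names `det`, `principal_minors` and `leading_principal_minors` are hypothetical, and the cofactor expansion is only suitable for tiny matrices.

```python
from itertools import combinations

def det(A):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total

def leading_principal_minors(A):
    """Determinants of the top-left k x k submatrices, k = 1..n."""
    return [det([row[:k] for row in A[:k]]) for k in range(1, len(A) + 1)]

def principal_minors(A):
    """Determinants of all submatrices built from the same row and column index set."""
    n = len(A)
    minors = []
    for k in range(1, n + 1):
        for idx in combinations(range(n), k):
            minors.append(det([[A[i][j] for j in idx] for i in idx]))
    return minors

# The diagonal matrix with entries 0 and -1 from the text.
D = [[0, 0], [0, -1]]

leading = leading_principal_minors(D)   # minors of orders 1 and 2
all_minors = principal_minors(D)        # index sets {0}, {1}, {0, 1}

print(all(m >= 0 for m in leading))     # leading minors alone raise no alarm
print(all(m >= 0 for m in all_minors))  # the minor for index set {1} is negative
```

The leading principal minors are all nonnegative, but the principal minor taken from the second diagonal entry is −1, so the matrix is not positive semidefinite.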
Block matrices
A positive semidefinite [math]\displaystyle{ 2n \times 2n }[/math] matrix may also be defined by blocks:
- [math]\displaystyle{ M = \begin{bmatrix} A & B \\ C & D \end{bmatrix} }[/math]
where each block is [math]\displaystyle{ n \times n }[/math]. By applying the positivity condition, it immediately follows that [math]\displaystyle{ A }[/math] and [math]\displaystyle{ D }[/math] are Hermitian, and [math]\displaystyle{ C = B^* }[/math].
We have that [math]\displaystyle{ z^* Mz \ge 0 }[/math] for all complex [math]\displaystyle{ z }[/math], and in particular for [math]\displaystyle{ z = [v, 0]^\textsf{T} }[/math]. Then
- [math]\displaystyle{ \begin{bmatrix} v^* & 0 \end{bmatrix} \begin{bmatrix} A & B \\ B^* & D \end{bmatrix} \begin{bmatrix} v \\ 0 \end{bmatrix} = v^* Av \ge 0. }[/math]
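A similar argument can be applied to [math]\displaystyle{ D }[/math], and thus we conclude that both [math]\displaystyle{ A }[/math] and [math]\displaystyle{ D }[/math] must be positive semidefinite matrices as well (and positive definite if [math]\displaystyle{ M }[/math] is positive definite, since the inequality is then strict for non-zero [math]\displaystyle{ v }[/math]).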
Converse results can be proved with stronger conditions on the blocks, for instance using the Schur complement.
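As a sketch of such a converse, the standard Schur complement criterion (stated here as an illustration, for Hermitian [math]\displaystyle{ M }[/math] with [math]\displaystyle{ C = B^* }[/math]) says that [math]\displaystyle{ M }[/math] is positive definite if and only if
[math]\displaystyle{ A \gt 0 \quad \text{and} \quad D - B^* A^{-1} B \gt 0. }[/math]
The numbers below are chosen only as an example, with [math]\displaystyle{ n = 1 }[/math], [math]\displaystyle{ A = 2 }[/math] and [math]\displaystyle{ D = 1 }[/math]. For [math]\displaystyle{ B = 1 }[/math] the Schur complement is [math]\displaystyle{ 1 - 1 \cdot \tfrac{1}{2} \cdot 1 = \tfrac{1}{2} \gt 0 }[/math], so [math]\displaystyle{ \left[\begin{smallmatrix} 2 & 1 \\ 1 & 1 \end{smallmatrix}\right] }[/math] is positive definite. For [math]\displaystyle{ B = 2 }[/math] it is [math]\displaystyle{ 1 - 2 \cdot \tfrac{1}{2} \cdot 2 = -1 \lt 0 }[/math], so [math]\displaystyle{ \left[\begin{smallmatrix} 2 & 2 \\ 2 & 1 \end{smallmatrix}\right] }[/math] is not positive definite, even though both diagonal blocks are positive. Positivity of the diagonal blocks is therefore necessary but not sufficient.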
Extension for non-Hermitian square matrices
The definition of positive definite can be generalized by designating any complex matrix [math]\displaystyle{ M }[/math] (e.g. real non-symmetric) as positive definite if [math]\displaystyle{ \Re\left(z^* Mz\right) \gt 0 }[/math] for all non-zero complex vectors [math]\displaystyle{ z }[/math], where [math]\displaystyle{ \Re(c) }[/math] denotes the real part of a complex number [math]\displaystyle{ c }[/math].[14] Only the Hermitian part [math]\displaystyle{ \tfrac{1}{2}\left(M + M^*\right) }[/math] determines whether the matrix is positive definite in this sense, and it is assessed in the narrower sense above. Similarly, if [math]\displaystyle{ x }[/math] and [math]\displaystyle{ M }[/math] are real, we have [math]\displaystyle{ x^\textsf{T} M x \gt 0 }[/math] for all real nonzero vectors [math]\displaystyle{ x }[/math] if and only if the symmetric part [math]\displaystyle{ \tfrac{1}{2}\left(M + M^\textsf{T}\right) }[/math] is positive definite in the narrower sense. Indeed, the quadratic form [math]\displaystyle{ x^\textsf{T} M x=\sum_{i,j} x_iM_{ij}x_j }[/math] is unchanged when [math]\displaystyle{ M }[/math] is replaced by its transpose.
Consequently, a non-symmetric real matrix with only positive eigenvalues does not need to be positive definite. For example, the matrix [math]\displaystyle{ M = \left[\begin{smallmatrix} 4 & 9 \\ 1 & 4 \end{smallmatrix}\right] }[/math] has positive eigenvalues yet is not positive definite; in particular a negative value of [math]\displaystyle{ x^\textsf{T} Mx }[/math] is obtained with the choice [math]\displaystyle{ x = \left[\begin{smallmatrix} -1 \\ 1 \end{smallmatrix}\right] }[/math] (which is the eigenvector associated with the negative eigenvalue of the symmetric part of [math]\displaystyle{ M }[/math]).
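The numbers in this example can be verified directly. The following is a minimal sketch in plain Python; the helpers `eig2` and `quad_form` are hypothetical names introduced here, and `eig2` assumes a real 2 × 2 matrix whose eigenvalues are real.

```python
import math

def eig2(A):
    """Eigenvalues of a real 2x2 matrix with real spectrum, from trace and determinant."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = math.sqrt(tr * tr - 4 * det)
    return ((tr - disc) / 2, (tr + disc) / 2)

def quad_form(A, x):
    """Quadratic form x^T A x for a 2x2 matrix and a 2-vector."""
    return sum(x[i] * A[i][j] * x[j] for i in range(2) for j in range(2))

M = [[4.0, 9.0], [1.0, 4.0]]

# Symmetric part (M + M^T) / 2.
S = [[(M[i][j] + M[j][i]) / 2 for j in range(2)] for i in range(2)]

x = [-1.0, 1.0]

eig_M = eig2(M)       # both eigenvalues of M are positive
eig_S = eig2(S)       # the symmetric part has one negative eigenvalue
q_M = quad_form(M, x)
q_S = quad_form(S, x)

print(eig_M, eig_S, q_M, q_S)
```

The quadratic forms of [math]\displaystyle{ M }[/math] and of its symmetric part agree, which is why the sign of [math]\displaystyle{ x^\textsf{T} Mx }[/math] is governed by the eigenvalues of the symmetric part rather than those of [math]\displaystyle{ M }[/math] itself.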
In summary, the distinguishing feature between the real and complex cases is that a bounded positive operator on a complex Hilbert space is necessarily Hermitian, or self-adjoint. The general claim can be argued using the polarization identity. This is no longer true in the real case.
Applications
Heat conductivity matrix
Fourier's law of heat conduction, giving heat flux [math]\displaystyle{ q }[/math] in terms of the temperature gradient [math]\displaystyle{ g = \nabla T }[/math], is written for anisotropic media as [math]\displaystyle{ q = -Kg }[/math], in which [math]\displaystyle{ K }[/math] is the symmetric thermal conductivity matrix. The negative sign is inserted in Fourier's law to reflect the expectation that heat will always flow from hot to cold. In other words, since the temperature gradient [math]\displaystyle{ g }[/math] always points from cold to hot, the heat flux [math]\displaystyle{ q }[/math] is expected to have a negative inner product with [math]\displaystyle{ g }[/math], so that [math]\displaystyle{ q^\textsf{T}g \lt 0 }[/math] whenever [math]\displaystyle{ g \neq 0 }[/math]. Substituting Fourier's law then gives this expectation as [math]\displaystyle{ g^\textsf{T}Kg \gt 0 }[/math], implying that the conductivity matrix should be positive definite.
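A small numerical illustration of this argument follows. It is a sketch in plain Python under stated assumptions: the conductivity matrix `K` and the gradient `g` are hypothetical values chosen only for the example (the leading principal minors of `K` are 3, 5 and 5, so it is positive definite), and no physical units are implied.

```python
# Hypothetical symmetric positive-definite conductivity matrix (illustrative values).
K = [
    [3.0, 1.0, 0.0],
    [1.0, 2.0, 0.0],
    [0.0, 0.0, 1.0],
]

# Hypothetical temperature gradient g = grad T.
g = [1.0, -2.0, 0.5]

# Fourier's law for anisotropic media: q = -K g.
q = [-sum(K[i][j] * g[j] for j in range(3)) for i in range(3)]

# Inner product of heat flux and gradient, and the quadratic form g^T K g.
q_dot_g = sum(qi * gi for qi, gi in zip(q, g))
gKg = sum(g[i] * K[i][j] * g[j] for i in range(3) for j in range(3))

# q^T g equals -(g^T K g), so it is negative exactly when g^T K g is positive.
print(q_dot_g, gKg)
```

For these values the heat flux has a negative inner product with the gradient, consistent with heat flowing from hot to cold.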
See also
- Cholesky decomposition
- Covariance matrix
- M-matrix
- Positive-definite function
- Positive-definite kernel
- Schur complement
- Square root of a matrix
- Sylvester's criterion
- Symmetric matrix
- Numerical range
Notes
- ↑ "Appendix C: Positive Semidefinite and Positive Definite Matrices". Parameter Estimation for Scientists and Engineers: 259–263. doi:10.1002/9780470173862.app3.
- ↑ Stewart, J. (1976). "Positive definite functions and generalizations, an historical survey". Rocky Mountain J. Math. 6 (3): 409–434. doi:10.1216/RMJ-1976-6-3-409.
- ↑ (Horn Johnson), p. 218 ff.
- ↑ (Horn Johnson), p. 397
- ↑ (Horn Johnson), Corollary 7.7.4(a)
- ↑ 6.0 6.1 (Horn Johnson), Observation 7.1.3
- ↑
Horn, Roger A.; Johnson, Charles R. (2013). "7.1 Definitions and Properties". Matrix Analysis (2nd ed.). Cambridge University Press. p. 431. ISBN 978-0-521-83940-2. "Observation 7.1.8 Let [math]\displaystyle{ A \in M_n }[/math] be Hermitian and let [math]\displaystyle{ C \in M_{n,m} }[/math]:
- Suppose that A is positive semidefinite. Then [math]\displaystyle{ C^* AC }[/math] is positive semidefinite, [math]\displaystyle{ \operatorname{nullspace}(C^* AC) = \operatorname{nullspace}(AC) }[/math], and [math]\displaystyle{ \operatorname{rank}(C^* AC) = \operatorname{rank}(AC) }[/math]
- Suppose that A is positive definite. Then [math]\displaystyle{ \operatorname{rank}(C^* AC) = \operatorname{rank}(C) }[/math], and [math]\displaystyle{ C^* AC }[/math] is positive definite if and only if rank(C) = m"
- ↑ (Horn Johnson), Theorem 7.2.6 with [math]\displaystyle{ k = 2 }[/math]
- ↑ (Horn Johnson), p. 398
- ↑ (Horn Johnson), Theorem 7.5.3
- ↑ (Horn Johnson), Theorem 7.8.6
- ↑ Styan, G. P. (1973). "Hadamard products and multivariate statistical analysis". Linear Algebra and Its Applications 6: 217–240., Corollary 3.6, p. 227
- ↑ Bhatia, Rajendra (2007). Positive Definite Matrices. Princeton, New Jersey: Princeton University Press. p. 8. ISBN 978-0-691-12918-1.
- ↑ Weisstein, Eric W. Positive Definite Matrix. From MathWorld--A Wolfram Web Resource. Accessed on 2012-07-26
References
- Horn, Roger A.; Johnson, Charles R. (1990). Matrix Analysis. Cambridge University Press. ISBN 978-0-521-38632-6.
- Bhatia, Rajendra (2007). Positive definite matrices. Princeton Series in Applied Mathematics. ISBN 978-0-691-12918-1.
- Bernstein, B.; Toupin, R. A. (1962). "Some Properties of the Hessian Matrix of a Strictly Convex Function". Journal für die reine und angewandte Mathematik 210: 67–72. doi:10.1515/crll.1962.210.65.
External links
- Hazewinkel, Michiel, ed. (2001), "Positive-definite form", Encyclopedia of Mathematics, Springer Science+Business Media B.V. / Kluwer Academic Publishers, ISBN 978-1-55608-010-4, https://www.encyclopediaofmath.org/index.php?title=p/p073880
- Wolfram MathWorld: Positive Definite Matrix