Sylvester's criterion

In mathematics, Sylvester's criterion is a necessary and sufficient condition for a Hermitian matrix to be positive-definite. It is named after James Joseph Sylvester.

Sylvester's criterion states that an n × n Hermitian matrix M is positive-definite if and only if all the following matrices have a positive determinant:

  • the upper left 1-by-1 corner of M,
  • the upper left 2-by-2 corner of M,
  • the upper left 3-by-3 corner of M,
  • [math]\displaystyle{ {}\quad\vdots }[/math]
  • M itself.

In other words, all of the leading principal minors must be positive. By using appropriate permutations of rows and columns of M, it can also be shown that the positivity of any nested sequence of n principal minors of M is equivalent to M being positive-definite.[1]
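
The criterion translates directly into a numerical test. The following minimal Python/NumPy sketch (not part of the original article; the helper name and example matrix are arbitrary choices) checks the leading principal minors and compares the result with the eigenvalue characterization:

```python
import numpy as np

def sylvester_positive_definite(M):
    """Sylvester's criterion: a Hermitian M is positive-definite iff
    every leading principal minor det(M[:k, :k]) is positive."""
    n = M.shape[0]
    return all(np.linalg.det(M[:k, :k]).real > 0 for k in range(1, n + 1))

# Example: a symmetric (hence Hermitian) positive-definite matrix.
M = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])

print(sylvester_positive_definite(M))       # True
print(np.all(np.linalg.eigvalsh(M) > 0))    # True: agrees with the eigenvalue test
```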

An analogous theorem holds for characterizing positive-semidefinite Hermitian matrices, except that it is no longer sufficient to consider only the leading principal minors: a Hermitian matrix M is positive-semidefinite if and only if all principal minors of M are nonnegative.[2][3]
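
The standard 2 × 2 counterexample below illustrates why the leading principal minors alone do not suffice in the semidefinite case. This sketch is illustrative only; the helper `all_principal_minors_nonneg` is a name chosen here, not a library function:

```python
import numpy as np
from itertools import combinations

def all_principal_minors_nonneg(M):
    """Semidefinite test: every principal minor (determinant of the
    submatrix on an arbitrary index subset) must be nonnegative."""
    n = M.shape[0]
    for r in range(1, n + 1):
        for idx in combinations(range(n), r):
            if np.linalg.det(M[np.ix_(idx, idx)]).real < 0:
                return False
    return True

A = np.array([[0.0, 0.0],
              [0.0, -1.0]])
# Both leading principal minors are 0, hence nonnegative, yet A is not
# positive semidefinite: the principal minor on the index set {2} is -1.
print(all_principal_minors_nonneg(A))        # False
print(np.all(np.linalg.eigvalsh(A) >= 0))    # False
```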

Simple proof for special case

Suppose [math]\displaystyle{ M_n }[/math] is an [math]\displaystyle{ n\times n }[/math] Hermitian matrix, i.e. [math]\displaystyle{ M_n^{\dagger}=M_n }[/math]. Let [math]\displaystyle{ M_k,\ k=1,\ldots,n }[/math] be the leading principal submatrices, i.e. the [math]\displaystyle{ k\times k }[/math] upper left corner matrices. It will be shown that if [math]\displaystyle{ M_n }[/math] is positive definite, then the leading principal minors are positive; that is, [math]\displaystyle{ \det M_k\gt 0 }[/math] for all [math]\displaystyle{ k }[/math].

Suppose [math]\displaystyle{ M_n }[/math] is positive definite. Then each [math]\displaystyle{ M_k }[/math] is positive definite: indeed, choosing

[math]\displaystyle{ x=\left( \begin{array}{c} x_1\\ \vdots\\ x_k\\ 0\\ \vdots\\0\end{array} \right) = \left( \begin{array}{c} \vec{x}\\ 0\\ \vdots\\ 0 \end{array} \right) }[/math]

we can notice that [math]\displaystyle{ 0\lt x^{\dagger} M_n x=\vec{x}^{\dagger}M_k\vec{x} }[/math] for every nonzero [math]\displaystyle{ \vec{x} }[/math], so [math]\displaystyle{ M_k }[/math] is positive definite. Equivalently, the eigenvalues of [math]\displaystyle{ M_k }[/math] are positive, and this implies that [math]\displaystyle{ \det M_k\gt 0 }[/math] since the determinant is the product of the eigenvalues.

To prove the reverse implication, we use induction. The general form of an [math]\displaystyle{ (n+1) \times (n+1) }[/math] Hermitian matrix is

[math]\displaystyle{ M_{n+1}= \left( \begin{array}{cc}M_n&\vec{v}\\ \vec{v}^{\dagger}&d\end{array}\right) \qquad (*) }[/math],

where [math]\displaystyle{ M_n }[/math] is an [math]\displaystyle{ n \times n }[/math] Hermitian matrix, [math]\displaystyle{ \vec{v} }[/math] is a vector and [math]\displaystyle{ d }[/math] is a real constant.

Suppose the criterion holds for [math]\displaystyle{ M_n }[/math], and assume that all the leading principal minors of [math]\displaystyle{ M_{n+1} }[/math] are positive. Then [math]\displaystyle{ \det M_{n+1}\gt 0 }[/math] and [math]\displaystyle{ \det M_n\gt 0 }[/math], and [math]\displaystyle{ M_n }[/math] is positive definite by the inductive hypothesis (the leading principal minors of [math]\displaystyle{ M_n }[/math] are among those of [math]\displaystyle{ M_{n+1} }[/math]). Denote

[math]\displaystyle{ x=\left( \begin{array}{c}\vec{x}\\ x_{n+1}\end{array} \right) }[/math]

then

[math]\displaystyle{ x^{\dagger}M_{n+1}x=\vec{x}^{\dagger}M_n\vec{x}+x_{n+1}\vec{x}^{\dagger}\vec{v}+\bar{x}_{n+1}\vec{v}^{\dagger}\vec{x}+d|x_{n+1}|^2 }[/math]

By completing the square, this last expression is equal to

[math]\displaystyle{ (\vec{x}^{\dagger}+\bar{x}_{n+1}\vec{v}^{\dagger}M_n^{-1})M_n(\vec{x}+x_{n+1}M_n^{-1}\vec{v})-|x_{n+1}|^2\vec{v}^{\dagger}M_n^{-1}\vec{v}+d|x_{n+1}|^2 }[/math]
[math]\displaystyle{ =(\vec{x}+\vec{c})^{\dagger}M_n(\vec{x}+\vec{c})+|x_{n+1}|^2(d-\vec{v}^{\dagger}M_n^{-1}\vec{v}) }[/math]

where [math]\displaystyle{ \vec{c}=x_{n+1}M_n^{-1}\vec{v} }[/math] (note that [math]\displaystyle{ M_n^{-1} }[/math] exists because the eigenvalues of [math]\displaystyle{ M_n }[/math] are all positive). The first term is non-negative by the inductive hypothesis, and strictly positive whenever [math]\displaystyle{ \vec{x}+\vec{c}\neq 0 }[/math]. We now examine the sign of the second term. By using the block matrix determinant formula

[math]\displaystyle{ \det \left(\begin{array}{cc}A&B\\C&D\end{array} \right)=\det A\det(D-CA^{-1}B) }[/math]

on [math]\displaystyle{ (*) }[/math] we obtain

[math]\displaystyle{ \det M_{n+1}=\det M_n\,(d-\vec{v}^{\dagger}M_n^{-1}\vec{v})\gt 0 }[/math], which, since [math]\displaystyle{ \det M_n\gt 0 }[/math], implies [math]\displaystyle{ d-\vec{v}^{\dagger}M_n^{-1}\vec{v}\gt 0 }[/math].

Consequently, for any nonzero [math]\displaystyle{ x }[/math]: if [math]\displaystyle{ x_{n+1}\neq 0 }[/math], the second term above is strictly positive and the first is non-negative; if [math]\displaystyle{ x_{n+1}=0 }[/math], then [math]\displaystyle{ \vec{c}=0 }[/math] and [math]\displaystyle{ \vec{x}\neq 0 }[/math], so the first term is strictly positive. In either case [math]\displaystyle{ x^{\dagger}M_{n+1}x\gt 0, }[/math] so [math]\displaystyle{ M_{n+1} }[/math] is positive definite, completing the induction.
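
Both the completed-square identity and the block determinant formula used above are easy to verify numerically. The following sketch is illustrative only; the construction of [math]\displaystyle{ M_{n+1} }[/math], [math]\displaystyle{ \vec{v} }[/math] and [math]\displaystyle{ d }[/math] is an arbitrary choice that satisfies the hypotheses:

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a Hermitian positive-definite M_n and a bordered M_{n+1} as in (*).
n = 4
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Mn = B.conj().T @ B + n * np.eye(n)                   # Hermitian positive definite
v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
d = (v.conj() @ np.linalg.solve(Mn, v)).real + 1.0    # makes d - v† Mn^{-1} v = 1 > 0
Mn1 = np.block([[Mn, v[:, None]], [v[None, :].conj(), d]])

x = rng.standard_normal(n + 1) + 1j * rng.standard_normal(n + 1)
xv, xlast = x[:n], x[n]

# Left side: x† M_{n+1} x; right side: completed square plus Schur-complement term.
lhs = (x.conj() @ Mn1 @ x).real
c = xlast * np.linalg.solve(Mn, v)
schur = d - (v.conj() @ np.linalg.solve(Mn, v)).real
rhs = ((xv + c).conj() @ Mn @ (xv + c)).real + abs(xlast) ** 2 * schur
print(np.isclose(lhs, rhs))                           # True: completed-square identity

# Block determinant formula: det M_{n+1} = det M_n * (d - v† Mn^{-1} v).
print(np.isclose(np.linalg.det(Mn1).real,
                 np.linalg.det(Mn).real * schur))     # True
```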

Proof for general case

The previous proof applies only to nonsingular Hermitian matrices with coefficients in [math]\displaystyle{ \mathbb{R} }[/math], and therefore only to nonsingular real-symmetric matrices.

Theorem I: A real-symmetric matrix A has nonnegative eigenvalues if and only if A can be factored as [math]\displaystyle{ A = B^TB }[/math], and all eigenvalues are positive if and only if B is nonsingular.[4]

Proof:

Forward implication: If [math]\displaystyle{ A \in \mathbb{R}^{n\times n} }[/math] is symmetric, then, by the spectral theorem, there is an orthogonal matrix P such that [math]\displaystyle{ A = PDP^T }[/math], where [math]\displaystyle{ D = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n) }[/math] is a real diagonal matrix whose entries are the eigenvalues of A, and the columns of P are corresponding eigenvectors of A. If [math]\displaystyle{ \lambda_i \geq 0 }[/math] for each i, then [math]\displaystyle{ D^{1/2} }[/math] exists, so [math]\displaystyle{ A = PDP^T = PD^{1/2}D^{1/2}P^T = B^TB }[/math] for [math]\displaystyle{ B = D^{1/2}P^T }[/math]; and [math]\displaystyle{ \lambda_i \gt 0 }[/math] for each i if and only if B is nonsingular.

Reverse implication: Conversely, if A can be factored as [math]\displaystyle{ A = B^TB }[/math], then all eigenvalues of A are nonnegative because for any eigenpair [math]\displaystyle{ (\lambda, x) }[/math]:

[math]\displaystyle{ \lambda = {\frac{x^TAx}{x^Tx}}={\frac{x^TB^TBx}{x^Tx}}={\frac{\|Bx\|^2}{\|x\|^2}} \ge 0. }[/math]
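
The construction in the forward implication can be carried out with a symmetric eigendecomposition. A minimal sketch, assuming a real-symmetric positive-semidefinite input (the helper name and example matrix are chosen here for illustration):

```python
import numpy as np

def spectral_sqrt_factor(A):
    """For symmetric A with nonnegative eigenvalues, return B with A = B^T B,
    built as B = D^{1/2} P^T from the spectral decomposition A = P D P^T."""
    eigvals, P = np.linalg.eigh(A)        # columns of P are orthonormal eigenvectors
    assert np.all(eigvals >= -1e-12), "A must have nonnegative eigenvalues"
    B = np.diag(np.sqrt(np.clip(eigvals, 0.0, None))) @ P.T
    return B

A = np.array([[4.0, 2.0], [2.0, 3.0]])
B = spectral_sqrt_factor(A)
print(np.allclose(B.T @ B, A))            # True
```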

Theorem II (The Cholesky decomposition): The symmetric matrix A possesses positive pivots if and only if A can be uniquely factored as [math]\displaystyle{ A = R^TR }[/math], where R is an upper-triangular matrix with positive diagonal entries. This is known as the Cholesky decomposition of A, and R is called the Cholesky factor of A.[5]

Proof:

Forward implication: If A possesses positive pivots (therefore A possesses an LU factorization [math]\displaystyle{ A = LU' }[/math]), then it has an LDU factorization [math]\displaystyle{ A = LDU }[/math] in which [math]\displaystyle{ D = \operatorname{diag}(u_{11}, u_{22}, \ldots, u_{nn}) }[/math] is the diagonal matrix containing the pivots [math]\displaystyle{ u_{ii} \gt 0 }[/math]:

[math]\displaystyle{ \begin{align} \mathbf{A} & =LU'= \begin{bmatrix} 1 & 0 & \cdots & 0\\ \ell_{21} & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ \ell_{n1} & \ell_{n2} & \cdots & 1 \end{bmatrix} \begin{bmatrix} u_{11} & u_{12} & \cdots & u_{1n}\\ 0 & u_{22} & \cdots & u_{2n} \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & u_{nn} \end{bmatrix} \\[8pt] & = LDU= \begin{bmatrix} 1 & 0 & \cdots & 0\\ \ell_{21} & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ \ell_{n1} & \ell_{n2} & \cdots & 1 \end{bmatrix} \begin{bmatrix} u_{11} & 0 & \cdots & 0\\ 0 & u_{22} & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & u_{nn} \end{bmatrix} \begin{bmatrix} 1 & u_{12}/u_{11} & \cdots & u_{1n}/u_{11} \\ 0 & 1 & \cdots & u_{2n}/u_{22} \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \end{align} }[/math]

By a uniqueness property of the LDU decomposition, the symmetry of A yields [math]\displaystyle{ U = L^T }[/math]; consequently [math]\displaystyle{ A = LDU = LDL^T }[/math]. Setting [math]\displaystyle{ R = D^{1/2}L^T }[/math], where [math]\displaystyle{ D^{1/2} = \operatorname{diag}(\sqrt{u_{11}},\sqrt{u_{22}},\ldots,\sqrt{u_{nn}}) }[/math], yields the desired factorization, because [math]\displaystyle{ A = LD^{1/2}D^{1/2}L^T = R^TR }[/math], and R is upper triangular with positive diagonal entries.

Reverse implication: Conversely, if [math]\displaystyle{ A = RR^T }[/math], where R is lower triangular with a positive diagonal, then we can factor the diagonal entries out of R as follows:

[math]\displaystyle{ \mathbf{R} =LD= \begin{bmatrix} 1 & 0 & \cdots & 0\\ r_{21}/r_{11} & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ r_{n1}/r_{11} & r_{n2}/r_{22} & \cdots & 1 \end{bmatrix} \begin{bmatrix} r_{11} & 0 & \cdots & 0\\ 0 & r_{22} & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & r_{nn} \end{bmatrix}. }[/math]

Here [math]\displaystyle{ R = LD }[/math], where L is a lower triangular matrix with a unit diagonal and D is the diagonal matrix whose diagonal entries are the [math]\displaystyle{ r_{ii} }[/math]'s. Since D has a positive diagonal, it is nonsingular, and hence [math]\displaystyle{ D^2 }[/math] is a nonsingular diagonal matrix. Also, [math]\displaystyle{ L^T }[/math] is an upper triangular matrix with a unit diagonal. Consequently, [math]\displaystyle{ A = LD^2L^T }[/math] is the LDU factorization of A, and thus the pivots must be positive because they are the diagonal entries of [math]\displaystyle{ D^2 }[/math].

Uniqueness of the Cholesky decomposition: If we have another Cholesky decomposition [math]\displaystyle{ A = R_1R_1^T }[/math] of A, where [math]\displaystyle{ R_1 }[/math] is lower triangular with a positive diagonal, then, as above, we may write [math]\displaystyle{ R_1 = L_1D_1 }[/math], where [math]\displaystyle{ L_1 }[/math] is a lower triangular matrix with a unit diagonal and [math]\displaystyle{ D_1 }[/math] is a diagonal matrix whose diagonal entries are the same as the corresponding diagonal entries of [math]\displaystyle{ R_1 }[/math]. Consequently, [math]\displaystyle{ A = L_1D_1^2L_1^T }[/math] is an LDU factorization of A. By the uniqueness of the LDU factorization of A, we have [math]\displaystyle{ L_1 = L }[/math] and [math]\displaystyle{ D_1^2 = D^2 }[/math]. As both [math]\displaystyle{ D_1 }[/math] and D are diagonal matrices with positive diagonal entries, we have [math]\displaystyle{ D_1 = D }[/math]. Hence [math]\displaystyle{ R_1 = L_1D_1 = LD = R }[/math], so A has a unique Cholesky decomposition.
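
In practice the Cholesky factor is computed directly; for instance, NumPy's `np.linalg.cholesky` returns the lower-triangular factor L, so R = L^T in the notation of Theorem II. A short illustrative sketch (the example matrix is an arbitrary positive-definite choice):

```python
import numpy as np

A = np.array([[4.0, 2.0, 0.0],
              [2.0, 5.0, 1.0],
              [0.0, 1.0, 3.0]])

L = np.linalg.cholesky(A)        # lower triangular with positive diagonal
R = L.T                          # upper-triangular Cholesky factor: A = R^T R
print(np.allclose(R.T @ R, A))   # True
print(np.all(np.diag(R) > 0))    # True: positive diagonal, hence the unique factor
```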

Theorem III: Let [math]\displaystyle{ A_k }[/math] be the k × k leading principal submatrix of the n × n matrix A. If A has an LU factorization A = LU, where L is a lower triangular matrix with a unit diagonal, then [math]\displaystyle{ \det(A_k) = u_{11}u_{22}\cdots u_{kk} }[/math], and the k-th pivot is [math]\displaystyle{ u_{kk} = \det(A_1) = a_{11} }[/math] for k = 1 and [math]\displaystyle{ u_{kk} = \det(A_k)/\det(A_{k-1}) }[/math] for k = 2, 3, . . . , n, where [math]\displaystyle{ u_{kk} }[/math] is the (k, k)-th entry of U.[6]
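
Theorem III connects pivots to minors; for a positive-definite matrix the pivots are also the squares of the diagonal entries of the Cholesky factor, since A = R^TR = LDL^T with D = diag(r_{11}^2, . . . , r_{nn}^2). A sketch checking both facts on an arbitrary example:

```python
import numpy as np

A = np.array([[4.0, 2.0, 0.0],
              [2.0, 5.0, 1.0],
              [0.0, 1.0, 3.0]])
n = A.shape[0]

# Pivots as ratios of consecutive leading principal minors (Theorem III).
minors = [np.linalg.det(A[:k, :k]) for k in range(1, n + 1)]
pivots = [minors[0]] + [minors[k] / minors[k - 1] for k in range(1, n)]

# For positive-definite A, the pivots equal the squared Cholesky diagonal.
R = np.linalg.cholesky(A).T
print(np.allclose(pivots, np.diag(R) ** 2))   # True
```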

Combining Theorem II with Theorem III yields:

Statement I: If the symmetric matrix A can be factored as [math]\displaystyle{ A=R^TR }[/math] where R is an upper-triangular matrix with positive diagonal entries, then all the pivots of A are positive (by Theorem II), therefore all the leading principal minors of A are positive (by Theorem III).

Statement II: If the nonsingular n × n symmetric matrix A can be factored as [math]\displaystyle{ A=B^TB }[/math], then the QR decomposition (closely related to the Gram–Schmidt process) of B (B = QR) yields [math]\displaystyle{ A=B^TB=R^TQ^TQR=R^TR }[/math], where Q is an orthogonal matrix and R is an upper triangular matrix.

As A is nonsingular and [math]\displaystyle{ A=R^TR }[/math], all the diagonal entries [math]\displaystyle{ r_{jj} }[/math], j = 1, 2, . . . , n, of R are nonzero.

Let F be the diagonal matrix whose (j, j)-th entry is [math]\displaystyle{ f_{jj} = 1 }[/math] if [math]\displaystyle{ r_{jj} \gt 0 }[/math] and [math]\displaystyle{ f_{jj} = -1 }[/math] if [math]\displaystyle{ r_{jj} \lt 0 }[/math], for all j = 1, 2, . . . , n. Then [math]\displaystyle{ F^TF=I_n }[/math], the n × n identity matrix.

Let S = FR. Then S is an upper-triangular matrix whose diagonal entries are all positive, and [math]\displaystyle{ A=R^TR=R^TF^TFR=S^TS }[/math].
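
The sign-fixing matrix F corresponds to normalizing a QR factorization so that the triangular factor has a positive diagonal. The following sketch (illustrative only, with an arbitrary random nonsingular B) carries out the construction of Statement II:

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((3, 3))     # nonsingular with probability 1
A = B.T @ B                         # nonsingular symmetric matrix

Q, R = np.linalg.qr(B)              # R may have negative diagonal entries
F = np.diag(np.sign(np.diag(R)))    # diagonal +/-1 matrix, so F^T F = I
S = F @ R                           # upper triangular with positive diagonal
print(np.allclose(S.T @ S, A))      # True: A = R^T R = S^T S
```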

Note that Statement II requires the nonsingularity of the symmetric matrix A.

Combining Theorem I with Statement I and Statement II yields:

Statement III: If the real-symmetric matrix A is positive definite, then A possesses a factorization of the form [math]\displaystyle{ A = B^TB }[/math], where B is nonsingular (Theorem I). By Statement II, A then possesses a factorization of the form [math]\displaystyle{ A = R^TR }[/math], where R is an upper-triangular matrix with positive diagonal entries, and therefore all the leading principal minors of A are positive (Statement I).

In other words, Statement III proves the "only if" part of Sylvester's Criterion for non-singular real-symmetric matrices.

Sylvester's Criterion: The real-symmetric matrix A is positive definite if and only if all the leading principal minors of A are positive.
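
As a sanity check of the criterion as a whole, one can compare the minor test with the eigenvalue definition on random symmetric matrices. A self-contained illustrative sketch (agreement is exact up to floating-point round-off near singular cases):

```python
import numpy as np

rng = np.random.default_rng(2)

def leading_minors_positive(A):
    n = A.shape[0]
    return all(np.linalg.det(A[:k, :k]) > 0 for k in range(1, n + 1))

agree = 0
for _ in range(1000):
    X = rng.standard_normal((4, 4))
    A = (X + X.T) / 2                # random symmetric matrix
    agree += leading_minors_positive(A) == bool(np.all(np.linalg.eigvalsh(A) > 0))
print(agree, "of 1000 random symmetric matrices agree")   # expect 1000
```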

Proof for the case of positive semidefinite matrices

Let [math]\displaystyle{ M_n }[/math] be an n × n Hermitian matrix. Suppose [math]\displaystyle{ M_n }[/math] is positive semidefinite. Essentially the same proof as for the case that [math]\displaystyle{ M_n }[/math] is strictly positive definite shows that all principal minors (not only the leading principal minors) are non-negative.

For the reverse implication, it suffices to show that if all the principal minors of [math]\displaystyle{ M_n }[/math] are non-negative, then for all t > 0, all leading principal minors of the Hermitian matrix [math]\displaystyle{ M_n+tI_n }[/math] are strictly positive, where [math]\displaystyle{ I_n }[/math] is the n × n identity matrix. Indeed, from the positive definite case, we would know that the matrices [math]\displaystyle{ M_n+tI_n }[/math] are strictly positive definite. Since the limit of positive definite matrices is always positive semidefinite, we can take [math]\displaystyle{ t \to 0 }[/math] to conclude.

To show this, let [math]\displaystyle{ M_k }[/math] be the kth leading principal submatrix of [math]\displaystyle{ M_n. }[/math] We know that [math]\displaystyle{ q_k(t) = \det(M_k + tI_k) }[/math] is a polynomial in t, related to the characteristic polynomial [math]\displaystyle{ p_{M_k} }[/math] via [math]\displaystyle{ q_k(t) = (-1)^kp_{M_k}(-t). }[/math] Using the identity that expresses the coefficients of the characteristic polynomial in terms of traces of exterior powers, we can write [math]\displaystyle{ q_k(t) = \sum_{j=0}^k t^{k-j} \operatorname{tr}\left(\textstyle\bigwedge^j M_k\right), }[/math] where [math]\displaystyle{ \operatorname{tr}\left(\bigwedge^j M_k\right) }[/math] is the trace of the jth exterior power of [math]\displaystyle{ M_k. }[/math]

From the theory of minors, we know that the entries of the matrix of [math]\displaystyle{ \bigwedge^j M_k }[/math] (for j > 0) are just the j × j minors of [math]\displaystyle{ M_k. }[/math] In particular, the diagonal entries are the principal minors of [math]\displaystyle{ M_k }[/math], which of course are also principal minors of [math]\displaystyle{ M_n }[/math] and are thus non-negative. Since the trace of a matrix is the sum of the diagonal entries, it follows that [math]\displaystyle{ \operatorname{tr}\left(\textstyle\bigwedge^j M_k\right) \geq 0. }[/math] Thus the coefficient of [math]\displaystyle{ t^{k-j} }[/math] in [math]\displaystyle{ q_k(t) }[/math] is non-negative for all j > 0. For j = 0, the coefficient is clearly 1. In particular, [math]\displaystyle{ q_k(t) \gt 0 }[/math] for all t > 0, which is what was required to show.
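
This coefficient argument can be illustrated numerically: the coefficients of [math]\displaystyle{ q_k(t) = \det(tI_k + M_k) }[/math] are the coefficients of the characteristic polynomial of [math]\displaystyle{ -M_k }[/math], and for a positive-semidefinite matrix they are all non-negative. A sketch with a rank-deficient example matrix chosen for illustration:

```python
import numpy as np

# A rank-deficient positive-semidefinite matrix (eigenvalues 0 and 2).
M = np.array([[1.0, 1.0],
              [1.0, 1.0]])

# np.poly(-M) gives the coefficients of det(tI - (-M)) = det(tI + M) = q(t),
# in decreasing powers of t; these are the traces of the exterior powers of M.
coeffs = np.poly(-M)
print(coeffs)                       # [1. 2. 0.] : all non-negative
print(np.all(coeffs >= -1e-12))     # True, hence q(t) > 0 for every t > 0
```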

Notes

  1. Horn, Roger A.; Johnson, Charles R. (1985), Matrix Analysis, Cambridge University Press, ISBN 978-0-521-38632-6 . See Theorem 7.2.5.
  2. Carl D. Meyer, Matrix Analysis and Applied Linear Algebra. See section 7.6 Positive Definite Matrices, page 566
  3. Prussing, John E. (1986), "The Principal Minor Test for Semidefinite Matrices", Journal of Guidance, Control, and Dynamics 9 (1): 121–122, doi:10.2514/3.20077, Bibcode: 1986JGCD....9..121P, archived from the original on 2017-01-07, https://web.archive.org/web/20170107084552/http://prussing.ae.illinois.edu/semidef.pdf, retrieved 2017-09-28
  4. Carl D. Meyer, Matrix Analysis and Applied Linear Algebra. See section 7.6 Positive Definite Matrices, page 558
  5. Carl D. Meyer, Matrix Analysis and Applied Linear Algebra. See section 3.10 The LU Factorization, Example 3.10.7, page 154
  6. Carl D. Meyer, Matrix Analysis and Applied Linear Algebra. See section 6.1 Determinants, Exercise 6.1.16, page 474
