Schur product theorem
In mathematics, particularly in linear algebra, the Schur product theorem states that the Hadamard (entrywise) product of two positive definite matrices is also a positive definite matrix. The result is named after Issai Schur[1] (Schur 1911, p. 14, Theorem VII); note that Schur signed as J. Schur in the Journal für die reine und angewandte Mathematik.[2][3]
We remark that the converse of the theorem holds in the following sense. If [math]\displaystyle{ M }[/math] is a symmetric matrix and the Hadamard product [math]\displaystyle{ M \circ N }[/math] is positive definite for all positive definite matrices [math]\displaystyle{ N }[/math], then [math]\displaystyle{ M }[/math] itself is positive definite.
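As a quick numerical illustration (a minimal sketch in Python with NumPy; the dimension, random seed, and the construction [math]\displaystyle{ A A^\textsf{T} + I }[/math] are arbitrary choices, not part of the theorem), one can sample two positive definite matrices and check that their Hadamard product has strictly positive eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5  # arbitrary dimension

# Construct two positive definite matrices as A A^T + I (an arbitrary recipe).
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
M = A @ A.T + np.eye(n)
N = B @ B.T + np.eye(n)

# Hadamard (entrywise) product.
H = M * N

# Per the theorem, every eigenvalue of the symmetric matrix H is positive.
print(np.linalg.eigvalsh(H).min() > 0)  # True
```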
Proof
Proof using the trace formula
For any matrices [math]\displaystyle{ M }[/math] and [math]\displaystyle{ N }[/math], the Hadamard product [math]\displaystyle{ M \circ N }[/math], considered as a sesquilinear form, acts on vectors [math]\displaystyle{ a, b }[/math] as
- [math]\displaystyle{ a^* (M \circ N) b = \operatorname{tr}\left(M^\textsf{T} \operatorname{diag}\left(a^*\right) N \operatorname{diag}(b)\right) }[/math]
where [math]\displaystyle{ \operatorname{tr} }[/math] is the matrix trace and [math]\displaystyle{ \operatorname{diag}(a) }[/math] is the diagonal matrix having as diagonal entries the elements of [math]\displaystyle{ a }[/math].
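This identity holds for arbitrary matrices and can be spot-checked numerically; the following sketch (arbitrary random complex matrices and vectors, using NumPy) compares the two sides:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4  # arbitrary dimension

# Arbitrary complex matrices and vectors; the identity holds for any M, N.
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
N = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
a = rng.standard_normal(n) + 1j * rng.standard_normal(n)
b = rng.standard_normal(n) + 1j * rng.standard_normal(n)

lhs = a.conj() @ (M * N) @ b                              # a* (M∘N) b
rhs = np.trace(M.T @ np.diag(a.conj()) @ N @ np.diag(b))  # trace formula
print(np.isclose(lhs, rhs))  # True
```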
Suppose [math]\displaystyle{ M }[/math] and [math]\displaystyle{ N }[/math] are positive definite, and so Hermitian. We can consider their square roots [math]\displaystyle{ M^\frac{1}{2} }[/math] and [math]\displaystyle{ N^\frac{1}{2} }[/math], which are also Hermitian, and write, using [math]\displaystyle{ M^\textsf{T} = \overline{M} }[/math] and the cyclic property of the trace,
- [math]\displaystyle{ \operatorname{tr}\left(M^\textsf{T} \operatorname{diag}\left(a^*\right) N \operatorname{diag}(b)\right) = \operatorname{tr}\left(\overline{M}^\frac{1}{2} \overline{M}^\frac{1}{2} \operatorname{diag}\left(a^*\right) N^\frac{1}{2} N^\frac{1}{2} \operatorname{diag}(b)\right) = \operatorname{tr}\left(\overline{M}^\frac{1}{2} \operatorname{diag}\left(a^*\right) N^\frac{1}{2} N^\frac{1}{2} \operatorname{diag}(b) \overline{M}^\frac{1}{2}\right) }[/math]
Then, for [math]\displaystyle{ a = b }[/math], this can be written as [math]\displaystyle{ \operatorname{tr}\left(A^* A\right) }[/math] for [math]\displaystyle{ A = N^\frac{1}{2} \operatorname{diag}(a) \overline{M}^\frac{1}{2} }[/math], and is thus strictly positive for [math]\displaystyle{ A \neq 0 }[/math], which occurs if and only if [math]\displaystyle{ a \neq 0 }[/math] (because [math]\displaystyle{ N^\frac{1}{2} }[/math] and [math]\displaystyle{ \overline{M}^\frac{1}{2} }[/math] are invertible). This shows that [math]\displaystyle{ (M \circ N) }[/math] is a positive definite matrix.
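The following sketch illustrates this step numerically: it forms [math]\displaystyle{ A = N^\frac{1}{2} \operatorname{diag}(a) \overline{M}^\frac{1}{2} }[/math] and checks that [math]\displaystyle{ a^* (M \circ N) a }[/math] equals [math]\displaystyle{ \operatorname{tr}\left(A^* A\right) }[/math], the squared Frobenius norm of [math]\displaystyle{ A }[/math]. The helper `herm_sqrt` is a hypothetical name, and the matrices are arbitrary random Hermitian positive definite choices:

```python
import numpy as np

def herm_sqrt(P):
    """Hermitian square root of a Hermitian positive definite matrix."""
    w, V = np.linalg.eigh(P)
    return (V * np.sqrt(w)) @ V.conj().T

rng = np.random.default_rng(2)
n = 4  # arbitrary dimension

# Arbitrary Hermitian positive definite M and N.
C = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
D = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
M = C @ C.conj().T + np.eye(n)
N = D @ D.conj().T + np.eye(n)

a = rng.standard_normal(n) + 1j * rng.standard_normal(n)  # nonzero vector

# A = N^{1/2} diag(a) conj(M)^{1/2}; note conj(M) is also Hermitian PD.
A = herm_sqrt(N) @ np.diag(a) @ herm_sqrt(M.conj())

lhs = (a.conj() @ (M * N) @ a).real  # a* (M∘N) a, real since M∘N is Hermitian
rhs = np.linalg.norm(A, 'fro') ** 2  # tr(A* A) = squared Frobenius norm
print(np.isclose(lhs, rhs) and lhs > 0)  # True
```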
Proof using Gaussian integration
Case of M = N
Let [math]\displaystyle{ X }[/math] be an [math]\displaystyle{ n }[/math]-dimensional centered Gaussian random variable with covariance [math]\displaystyle{ \langle X_i X_j \rangle = M_{ij} }[/math]. Then the covariance of [math]\displaystyle{ X_i^2 }[/math] and [math]\displaystyle{ X_j^2 }[/math] is
- [math]\displaystyle{ \operatorname{Cov}\left(X_i^2, X_j^2\right) = \left\langle X_i^2 X_j^2 \right\rangle - \left\langle X_i^2 \right\rangle \left\langle X_j^2 \right\rangle }[/math]
Using Wick's theorem to expand [math]\displaystyle{ \left\langle X_i^2 X_j^2 \right\rangle = 2 \left\langle X_i X_j \right\rangle^2 + \left\langle X_i^2 \right\rangle \left\langle X_j^2 \right\rangle }[/math], we have
- [math]\displaystyle{ \operatorname{Cov}\left(X_i^2, X_j^2\right) = 2 \left\langle X_i X_j \right\rangle^2 = 2 M_{ij}^2 }[/math]
Since a covariance matrix is positive semidefinite, and since no nontrivial linear combination [math]\displaystyle{ \sum_i c_i X_i^2 }[/math] is almost surely constant (because [math]\displaystyle{ X }[/math] has full support on [math]\displaystyle{ \mathbb{R}^n }[/math] when [math]\displaystyle{ M }[/math] is positive definite), this covariance matrix is positive definite, which proves that the matrix with elements [math]\displaystyle{ M_{ij}^2 }[/math] is a positive definite matrix.
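A Monte Carlo sketch of this identity (the covariance matrix, seed, and sample size below are arbitrary, and the comparison uses a loose tolerance since the empirical covariance is only approximate):

```python
import numpy as np

rng = np.random.default_rng(3)
n, samples = 3, 1_000_000  # arbitrary dimension and sample size

# An arbitrary positive definite covariance matrix M.
A = rng.standard_normal((n, n))
M = A @ A.T + np.eye(n)

# Sample centered Gaussians with covariance M and square them componentwise.
X = rng.multivariate_normal(np.zeros(n), M, size=samples)
C = np.cov(X**2, rowvar=False)  # empirical Cov(X_i^2, X_j^2)

# Should match 2 M_ij^2 up to Monte Carlo error.
print(np.allclose(C, 2 * M**2, rtol=0.05))  # True (approximately)
```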
General case
Let [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math] be [math]\displaystyle{ n }[/math]-dimensional centered Gaussian random variables with covariances [math]\displaystyle{ \left\langle X_i X_j \right\rangle = M_{ij} }[/math], [math]\displaystyle{ \left\langle Y_i Y_j \right\rangle = N_{ij} }[/math], independent of each other, so that we have
- [math]\displaystyle{ \left\langle X_i Y_j \right\rangle = 0 }[/math] for any [math]\displaystyle{ i, j }[/math]
Then the covariance of [math]\displaystyle{ X_i Y_i }[/math] and [math]\displaystyle{ X_j Y_j }[/math] is
- [math]\displaystyle{ \operatorname{Cov}\left(X_i Y_i, X_j Y_j\right) = \left\langle X_i Y_i X_j Y_j \right\rangle - \left\langle X_i Y_i \right\rangle \left\langle X_j Y_j \right\rangle }[/math]
Using Wick's theorem to expand
- [math]\displaystyle{ \left\langle X_i Y_i X_j Y_j \right\rangle = \left\langle X_i X_j \right\rangle \left\langle Y_i Y_j \right\rangle + \left\langle X_i Y_i \right\rangle \left\langle X_j Y_j \right\rangle + \left\langle X_i Y_j \right\rangle \left\langle X_j Y_i \right\rangle }[/math]
and also using the independence of [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math], we have
- [math]\displaystyle{ \operatorname{Cov}\left(X_i Y_i, X_j Y_j\right) = \left\langle X_i X_j \right\rangle \left\langle Y_i Y_j \right\rangle = M_{ij} N_{ij} }[/math]
Since a covariance matrix is positive semidefinite, and since no nontrivial linear combination [math]\displaystyle{ \sum_i c_i X_i Y_i }[/math] is almost surely constant (the pair [math]\displaystyle{ (X, Y) }[/math] has full support on [math]\displaystyle{ \mathbb{R}^{2n} }[/math]), this covariance matrix is positive definite, which proves that the matrix with elements [math]\displaystyle{ M_{ij} N_{ij} }[/math] is a positive definite matrix.
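The same Monte Carlo check works for the general case (again with arbitrary matrices, seed, and sample size, and a loose tolerance):

```python
import numpy as np

rng = np.random.default_rng(4)
n, samples = 3, 1_000_000  # arbitrary dimension and sample size

# Arbitrary positive definite covariance matrices M and N.
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
M = A @ A.T + np.eye(n)
N = B @ B.T + np.eye(n)

# Independent centered Gaussian samples with covariances M and N.
X = rng.multivariate_normal(np.zeros(n), M, size=samples)
Y = rng.multivariate_normal(np.zeros(n), N, size=samples)

C = np.cov(X * Y, rowvar=False)  # empirical Cov(X_i Y_i, X_j Y_j)
print(np.allclose(C, M * N, rtol=0.05))  # True (approximately M_ij N_ij)
```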
Proof using eigendecomposition
Proof of positive semidefiniteness
Let [math]\displaystyle{ M = \sum \mu_i m_i m_i^\textsf{T} }[/math] and [math]\displaystyle{ N = \sum \nu_i n_i n_i^\textsf{T} }[/math] be the spectral decompositions of [math]\displaystyle{ M }[/math] and [math]\displaystyle{ N }[/math], where the eigenvalues satisfy [math]\displaystyle{ \mu_i, \nu_i \gt 0 }[/math] since both matrices are positive definite. Then
- [math]\displaystyle{ M \circ N = \sum_{ij} \mu_i \nu_j \left(m_i m_i^\textsf{T}\right) \circ \left(n_j n_j^\textsf{T}\right) = \sum_{ij} \mu_i \nu_j \left(m_i \circ n_j\right) \left(m_i \circ n_j\right)^\textsf{T} }[/math]
Each [math]\displaystyle{ \left(m_i \circ n_j\right) \left(m_i \circ n_j\right)^\textsf{T} }[/math] is positive semidefinite (but, except in the 1-dimensional case, not positive definite, since it is a rank-1 matrix). Also, [math]\displaystyle{ \mu_i \nu_j \gt 0 }[/math], so the sum [math]\displaystyle{ M \circ N }[/math] is also positive semidefinite.
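This rank-one expansion can be verified numerically; the following sketch (arbitrary random positive definite matrices) rebuilds [math]\displaystyle{ M \circ N }[/math] from the terms [math]\displaystyle{ \mu_i \nu_j \left(m_i \circ n_j\right) \left(m_i \circ n_j\right)^\textsf{T} }[/math]:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4  # arbitrary dimension

# Arbitrary positive definite matrices.
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
M = A @ A.T + np.eye(n)
N = B @ B.T + np.eye(n)

mu, m = np.linalg.eigh(M)     # columns m[:, i] are eigenvectors of M
nu, nvec = np.linalg.eigh(N)  # columns nvec[:, j] are eigenvectors of N

# Rebuild M∘N from the rank-one terms mu_i nu_j (m_i∘n_j)(m_i∘n_j)^T.
H = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        v = m[:, i] * nvec[:, j]            # Hadamard product of eigenvectors
        H += mu[i] * nu[j] * np.outer(v, v)

print(np.allclose(H, M * N))  # True
```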
Proof of definiteness
To show that the result is positive definite requires a further argument. We shall show that, for any vector [math]\displaystyle{ a \neq 0 }[/math], we have [math]\displaystyle{ a^\textsf{T} (M \circ N) a \gt 0 }[/math]. Continuing as above, each [math]\displaystyle{ a^\textsf{T} \left(m_i \circ n_j\right) \left(m_i \circ n_j\right)^\textsf{T} a \ge 0 }[/math], so it remains to show that there exist [math]\displaystyle{ i }[/math] and [math]\displaystyle{ j }[/math] for which the corresponding term is nonzero. For this we observe that
- [math]\displaystyle{ a^\textsf{T} (m_i \circ n_j) (m_i \circ n_j)^\textsf{T} a = \left(\sum_k m_{i,k} n_{j,k} a_k\right)^2 }[/math]
Since [math]\displaystyle{ N }[/math] is positive definite, there is a [math]\displaystyle{ j }[/math] for which [math]\displaystyle{ n_j \circ a \neq 0 }[/math] (otherwise [math]\displaystyle{ n_j^\textsf{T} a = \sum_k (n_j \circ a)_k = 0 }[/math] for all [math]\displaystyle{ j }[/math], which would force [math]\displaystyle{ a = 0 }[/math] since the [math]\displaystyle{ n_j }[/math] form a basis), and likewise, since [math]\displaystyle{ M }[/math] is positive definite, the [math]\displaystyle{ m_i }[/math] form a basis and there exists an [math]\displaystyle{ i }[/math] for which [math]\displaystyle{ \sum_k m_{i,k} (n_j \circ a)_k = m_i^\textsf{T} (n_j \circ a) \neq 0. }[/math] But this last sum is just [math]\displaystyle{ \sum_k m_{i,k} n_{j,k} a_k }[/math], so its square is positive. This completes the proof.
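The existence of such indices can be illustrated numerically: for an arbitrary nonzero vector [math]\displaystyle{ a }[/math] and arbitrary positive definite matrices, at least one of the squared terms is strictly positive:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4  # arbitrary dimension

# Arbitrary positive definite matrices and a nonzero vector a.
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
M = A @ A.T + np.eye(n)
N = B @ B.T + np.eye(n)
a = rng.standard_normal(n)

mu, m = np.linalg.eigh(M)
nu, nvec = np.linalg.eigh(N)

# terms[i, j] = (m_i^T (n_j ∘ a))^2 = (sum_k m_{i,k} n_{j,k} a_k)^2
terms = np.array([[(m[:, i] @ (nvec[:, j] * a)) ** 2 for j in range(n)]
                  for i in range(n)])

print(terms.max() > 0)  # True: some term is strictly positive
```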
References
- ↑ Schur, J. (1911). "Bemerkungen zur Theorie der beschränkten Bilinearformen mit unendlich vielen Veränderlichen". Journal für die reine und angewandte Mathematik 1911 (140): 1–28. doi:10.1515/crll.1911.140.1.
- ↑ Zhang, Fuzhen, ed (2005). The Schur Complement and Its Applications. Numerical Methods and Algorithms. 4. doi:10.1007/b105056. ISBN 0-387-24271-6. See page 9, Ch. 0.6, "Publication under J. Schur".
- ↑ Ledermann, W. (1983). "Issai Schur and His School in Berlin". Bulletin of the London Mathematical Society 15 (2): 97–106. doi:10.1112/blms/15.2.97.
External links
- Bemerkungen zur Theorie der beschränkten Bilinearformen mit unendlich vielen Veränderlichen at EUDML
Original source: https://en.wikipedia.org/wiki/Schur_product_theorem.