Matrix F-distribution

From HandWiki
Short description: Multivariate continuous probability distribution


Matrix [math]\displaystyle{ F }[/math]
Notation [math]\displaystyle{ \mathcal{F}({\mathbf\Psi},\nu,\delta) }[/math]
Parameters [math]\displaystyle{ \mathbf{\Psi} \gt 0 }[/math], [math]\displaystyle{ p\times p }[/math] scale matrix (pos. def.)
[math]\displaystyle{ \nu \gt p-1 }[/math] degrees of freedom (real)
[math]\displaystyle{ \delta \gt 0 }[/math] degrees of freedom (real)
Support [math]\displaystyle{ \mathbf{X} }[/math] is a p × p positive definite matrix
PDF

[math]\displaystyle{ \frac{\Gamma_p\left(\frac{\nu+\delta+p-1}{2}\right)}{\Gamma_p\left(\frac{\nu}{2}\right)\Gamma_p\left(\frac{\delta+p-1}{2}\right)|\mathbf{\Psi}|^{\frac{\nu}{2}}}~|{\mathbf X}|^{\frac{\nu-p-1}{2}} |\textbf{I}_p+{\mathbf X}\mathbf{\Psi}^{-1}|^{-\frac{\nu+\delta+p-1}{2}} }[/math]

Mean [math]\displaystyle{ \tfrac{\nu}{\delta - 2}\mathbf{\Psi} }[/math], for [math]\displaystyle{ \delta \gt 2. }[/math]
Variance see below

In statistics, the matrix F distribution (or matrix variate F distribution) is a matrix variate generalization of the F distribution, defined on real-valued positive-definite matrices. In Bayesian statistics it can be used as a semi-conjugate prior for the covariance matrix or precision matrix of multivariate normal distributions and related distributions.[1][2][3][4]

Density

The probability density function of the matrix [math]\displaystyle{ F }[/math] distribution is:

[math]\displaystyle{ f_{\mathbf X}({\mathbf X}; {\mathbf \Psi}, \nu, \delta) = \frac{\Gamma_p\left(\frac{\nu+\delta+p-1}{2}\right)}{\Gamma_p\left(\frac{\nu}{2}\right)\Gamma_p\left(\frac{\delta+p-1}{2}\right)|\mathbf{\Psi}|^{\frac{\nu}{2}}}~|{\mathbf X}|^{\frac{\nu-p-1}{2}} |\textbf{I}_p+{\mathbf X}\mathbf{\Psi}^{-1}|^{-\frac{\nu+\delta+p-1}{2}} }[/math]

where [math]\displaystyle{ \mathbf{X} }[/math] and [math]\displaystyle{ {\mathbf\Psi} }[/math] are [math]\displaystyle{ p\times p }[/math] positive definite matrices, [math]\displaystyle{ | \cdot | }[/math] is the determinant, [math]\displaystyle{ \Gamma_p(\cdot) }[/math] is the multivariate gamma function, and [math]\displaystyle{ \textbf{I}_p }[/math] is the [math]\displaystyle{ p\times p }[/math] identity matrix.
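As an illustration, the density above can be evaluated on the log scale. The following is a minimal sketch, not a reference implementation: the function names `multigammaln` and `matrix_f_logpdf` are hypothetical, NumPy is assumed to be available, and the inputs are assumed to be symmetric positive definite (this is not checked).

```python
import math
import numpy as np

def multigammaln(a, p):
    """Log of the multivariate gamma function Gamma_p(a)."""
    return p * (p - 1) / 4.0 * math.log(math.pi) + sum(
        math.lgamma(a + (1 - j) / 2.0) for j in range(1, p + 1)
    )

def matrix_f_logpdf(X, Psi, nu, delta):
    """Log density of the matrix F distribution at X (hypothetical helper).

    Assumes X and Psi are symmetric positive definite p x p arrays,
    nu > p - 1 and delta > 0; none of this is validated here.
    """
    X = np.asarray(X, dtype=float)
    Psi = np.asarray(Psi, dtype=float)
    p = X.shape[0]
    _, logdet_X = np.linalg.slogdet(X)
    _, logdet_Psi = np.linalg.slogdet(Psi)
    # log |I_p + X Psi^{-1}|
    _, logdet_mix = np.linalg.slogdet(np.eye(p) + X @ np.linalg.inv(Psi))
    total = (nu + delta + p - 1) / 2.0
    log_norm = (
        multigammaln(total, p)
        - multigammaln(nu / 2.0, p)
        - multigammaln((delta + p - 1) / 2.0, p)
        - nu / 2.0 * logdet_Psi
    )
    return log_norm + (nu - p - 1) / 2.0 * logdet_X - total * logdet_mix
```

Working with log-determinants and log-gamma values avoids overflow for larger [math]\displaystyle{ p }[/math] or large degrees of freedom; exponentiate the result only if the density itself is needed.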

Properties

Construction of the distribution

  • The standard matrix F distribution, with an identity scale matrix [math]\displaystyle{ \mathbf I_p }[/math], was originally derived by Olkin and Rubin.[1] If [math]\displaystyle{ {\mathbf \Phi_1}\sim \mathcal{W}({\mathbf I_p},\nu) }[/math] and [math]\displaystyle{ {\mathbf \Phi_2}\sim \mathcal{W}({\mathbf I_p},\delta+p-1) }[/math] are independent and [math]\displaystyle{ \mathbf X = {\mathbf \Phi_2}^{-1/2}{\mathbf \Phi_1}{\mathbf \Phi_2}^{-1/2} }[/math], then [math]\displaystyle{ \mathbf X\sim \mathcal{F}({\mathbf I_p},\nu,\delta) }[/math].

  • If [math]\displaystyle{ {\mathbf X}|\mathbf\Phi\sim \mathcal{W}^{-1}({\mathbf\Phi},\delta+p-1) }[/math] and [math]\displaystyle{ {\mathbf \Phi}\sim \mathcal{W}({\mathbf\Psi},\nu) }[/math], then, after integrating out [math]\displaystyle{ \mathbf\Phi }[/math], [math]\displaystyle{ \mathbf X }[/math] has a matrix F-distribution, i.e.,

    [math]\displaystyle{ f_{\mathbf X | \mathbf\Psi, \nu, \delta}(\mathbf X) = \int f_{\mathbf X | \mathbf\Phi, \delta+p-1}(\mathbf X) f_{\mathbf\Phi | \mathbf\Psi, \nu}(\mathbf\Phi) d\mathbf\Phi. }[/math]
    This construction is useful for constructing a semi-conjugate prior for a covariance matrix.[3]

  • If [math]\displaystyle{ {\mathbf X}|\mathbf\Phi\sim \mathcal{W}({\mathbf\Phi},\nu) }[/math] and [math]\displaystyle{ {\mathbf \Phi}\sim \mathcal{W}^{-1}({\mathbf\Psi},\delta+p-1) }[/math], then, after integrating out [math]\displaystyle{ \mathbf\Phi }[/math], [math]\displaystyle{ \mathbf X }[/math] has a matrix F-distribution, i.e.,
    [math]\displaystyle{ f_{\mathbf X | \mathbf\Psi, \nu, \delta}(\mathbf X) = \int f_{\mathbf X | \mathbf\Phi, \nu}(\mathbf X) f_{\mathbf\Phi | \mathbf\Psi, \delta + p - 1}(\mathbf\Phi) d\mathbf\Phi. }[/math]
    This construction is useful for constructing a semi-conjugate prior for a precision matrix.[4]
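The hierarchical construction with [math]\displaystyle{ {\mathbf X}|\mathbf\Phi\sim \mathcal{W}({\mathbf\Phi},\nu) }[/math] and [math]\displaystyle{ {\mathbf \Phi}\sim \mathcal{W}^{-1}({\mathbf\Psi},\delta+p-1) }[/math] suggests a simple way to simulate from the distribution. The sketch below assumes integer degrees of freedom so that a Wishart draw can be formed as a sum of outer products of normal vectors; the helper names `sample_wishart` and `sample_matrix_f`, the example scale matrix and the parameter values are illustrative assumptions, not part of the cited sources.

```python
import numpy as np

def sample_wishart(scale, df, rng):
    """Draw from W(scale, df) for integer df as G G^T with normal columns."""
    p = scale.shape[0]
    chol = np.linalg.cholesky(scale)
    g = chol @ rng.standard_normal((p, df))
    return g @ g.T

def sample_matrix_f(Psi, nu, delta, rng):
    """One draw via X | Phi ~ W(Phi, nu), Phi ~ W^{-1}(Psi, delta + p - 1).

    Assumes integer nu >= p and integer delta >= 1 (an assumption of this
    sketch, not a restriction of the distribution itself).
    """
    Psi = np.asarray(Psi, dtype=float)
    p = Psi.shape[0]
    # Inverse-Wishart draw: invert a Wishart draw taken with the inverted scale.
    Phi = np.linalg.inv(sample_wishart(np.linalg.inv(Psi), delta + p - 1, rng))
    return sample_wishart(Phi, nu, rng)

rng = np.random.default_rng(0)
Psi = np.array([[2.0, 0.5], [0.5, 1.0]])  # illustrative scale matrix
nu, delta = 5, 10
draws = np.array([sample_matrix_f(Psi, nu, delta, rng) for _ in range(10000)])
sample_mean = draws.mean(axis=0)
theoretical_mean = nu / (delta - 2) * Psi  # mean of the matrix F distribution
```

With these settings the Monte Carlo average `sample_mean` should lie close to `theoretical_mean`, which gives a quick sanity check of the construction.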

Marginal distributions from a matrix F distributed matrix

Suppose [math]\displaystyle{ {\mathbf A}\sim \mathcal{F}({\mathbf\Psi},\nu,\delta) }[/math] has a matrix F distribution. Partition the matrices [math]\displaystyle{ {\mathbf A} }[/math] and [math]\displaystyle{ {\mathbf\Psi} }[/math] conformably with each other as

[math]\displaystyle{ {\mathbf{A}} = \begin{bmatrix} \mathbf{A}_{11} & \mathbf{A}_{12} \\ \mathbf{A}_{21} & \mathbf{A}_{22} \end{bmatrix}, \; {\mathbf{\Psi}} = \begin{bmatrix} \mathbf{\Psi}_{11} & \mathbf{\Psi}_{12} \\ \mathbf{\Psi}_{21} & \mathbf{\Psi}_{22} \end{bmatrix} }[/math]

where [math]\displaystyle{ {\mathbf A_{ij}} }[/math] and [math]\displaystyle{ {\mathbf \Psi_{ij}} }[/math] are [math]\displaystyle{ p_{i}\times p_{j} }[/math] matrices. Then [math]\displaystyle{ {\mathbf A_{11} } \sim \mathcal{F}({\mathbf \Psi_{11} }, \nu, \delta) }[/math].

Moments

Let [math]\displaystyle{ \mathbf X \sim \mathcal{F}({\mathbf\Psi},\nu,\delta) }[/math].

The mean, which exists for [math]\displaystyle{ \delta \gt 2 }[/math], is given by: [math]\displaystyle{ E(\mathbf X) = \frac{\nu}{\delta-2}\mathbf\Psi. }[/math]

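The covariances (and, for identical index pairs, the variances) of the elements of [math]\displaystyle{ \mathbf{X} }[/math] are given, for [math]\displaystyle{ \delta \gt 4 }[/math], by:[3]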

[math]\displaystyle{ \operatorname{cov}(X_{ij},X_{ml}) = \Psi_{ij}\Psi_{ml}\tfrac{2\nu^2+2\nu(\delta-2)}{(\delta-1)(\delta-2)^2(\delta-4)} + (\Psi_{il}\Psi_{jm}+\Psi_{im}\Psi_{jl})\left(\tfrac{2\nu+\nu^2(\delta-2)+\nu(\delta-2)}{(\delta-1)(\delta-2)^2(\delta-4)}+\tfrac{\nu}{(\delta-2)^2}\right). }[/math]
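The covariance formula can be transcribed directly into code. The sketch below is illustrative only: the function name `matrix_f_cov` is hypothetical, indices are zero-based, the scale matrix is passed as a nested list, and the check `delta > 4` is an assumption read off from the factor [math]\displaystyle{ (\delta-4) }[/math] in the denominators.

```python
def matrix_f_cov(Psi, nu, delta, i, j, m, l):
    """cov(X_ij, X_ml) for X ~ F(Psi, nu, delta), using zero-based indices."""
    if delta <= 4:
        # Assumption of this sketch: second moments require delta > 4.
        raise ValueError("the covariance formula requires delta > 4")
    denom = (delta - 1) * (delta - 2) ** 2 * (delta - 4)
    # Coefficient of Psi_ij * Psi_ml
    c1 = (2 * nu**2 + 2 * nu * (delta - 2)) / denom
    # Coefficient of Psi_il * Psi_jm + Psi_im * Psi_jl
    c2 = (2 * nu + nu**2 * (delta - 2) + nu * (delta - 2)) / denom \
        + nu / (delta - 2) ** 2
    return (Psi[i][j] * Psi[m][l] * c1
            + (Psi[i][l] * Psi[j][m] + Psi[i][m] * Psi[j][l]) * c2)
```

In the univariate case with [math]\displaystyle{ \mathbf\Psi = 1 }[/math], the expression reduces to [math]\displaystyle{ \tfrac{2\nu(\nu+\delta-2)}{(\delta-4)(\delta-2)^2} }[/math], the variance of a beta prime distribution with parameters [math]\displaystyle{ \nu/2 }[/math] and [math]\displaystyle{ \delta/2 }[/math]; for example, `matrix_f_cov([[1.0]], 1, 5, 0, 0, 0, 0)` evaluates to 8/9.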

Related distributions

  • The matrix F-distribution has also been termed the multivariate beta II distribution.[5] For a univariate version, see the scaled beta2 distribution.[6]
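  • A univariate version of the matrix F distribution is the F-distribution. With [math]\displaystyle{ p=1 }[/math] (i.e. univariate), [math]\displaystyle{ \mathbf\Psi = \delta/\nu }[/math], and [math]\displaystyle{ x=\mathbf{X} }[/math], the probability density function of the matrix F distribution becomes that of the univariate F distribution:
    [math]\displaystyle{ f_{x\mid\nu, \delta}(x) = \operatorname{B}\left(\tfrac{\nu}{2},\tfrac{\delta}{2}\right)^{-1} \left(\tfrac{\nu}{\delta}\right)^{\nu/2} x^{\nu/2 - 1} \left(1+\tfrac{\nu}{\delta} \, x \right)^{-(\nu+\delta)/2}. }[/math]
    With [math]\displaystyle{ \mathbf\Psi = 1 }[/math] instead, the density is the unscaled form [math]\displaystyle{ \operatorname{B}\left(\tfrac{\nu}{2},\tfrac{\delta}{2}\right)^{-1} x^{\nu/2 - 1} (1+x)^{-(\nu+\delta)/2} }[/math] (a beta prime distribution), under which [math]\displaystyle{ \tfrac{\delta}{\nu}x }[/math] is F-distributed.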
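  • In the univariate case, with [math]\displaystyle{ p=1 }[/math], [math]\displaystyle{ x=\mathbf{X} }[/math] and [math]\displaystyle{ \psi=\mathbf\Psi }[/math], setting [math]\displaystyle{ \nu=1 }[/math] makes [math]\displaystyle{ \sqrt{x} }[/math] follow a half-t distribution with [math]\displaystyle{ \delta }[/math] degrees of freedom. The density of [math]\displaystyle{ s=\sqrt{x} }[/math] is proportional to [math]\displaystyle{ (1+s^2/\psi)^{-(\delta+1)/2} }[/math], which corresponds to a scale parameter [math]\displaystyle{ A=\sqrt{\psi/\delta} }[/math] when the t density is written as [math]\displaystyle{ (1+s^2/(\delta A^2))^{-(\delta+1)/2} }[/math]. The half-t distribution is a common prior for standard deviations.[7]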
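The univariate reduction can be checked numerically. The following sketch uses only the Python standard library; the function names are hypothetical, and the scalar `psi` plays the role of [math]\displaystyle{ \mathbf\Psi }[/math] for [math]\displaystyle{ p=1 }[/math]. It evaluates the matrix F density with [math]\displaystyle{ p=1 }[/math] next to the standard F density, so that the two can be compared for a chosen scale.

```python
import math

def matrix_f_pdf_univariate(x, psi, nu, delta):
    """Matrix F density for p = 1 with scalar scale psi (hypothetical helper)."""
    log_norm = (math.lgamma((nu + delta) / 2) - math.lgamma(nu / 2)
                - math.lgamma(delta / 2) - nu / 2 * math.log(psi))
    return math.exp(log_norm + (nu / 2 - 1) * math.log(x)
                    - (nu + delta) / 2 * math.log1p(x / psi))

def f_pdf(x, nu, delta):
    """Density of the standard F distribution with (nu, delta) degrees of freedom."""
    log_beta = (math.lgamma(nu / 2) + math.lgamma(delta / 2)
                - math.lgamma((nu + delta) / 2))
    r = nu / delta
    return math.exp(-log_beta + nu / 2 * math.log(r)
                    + (nu / 2 - 1) * math.log(x)
                    - (nu + delta) / 2 * math.log1p(r * x))
```

Under these definitions, `matrix_f_pdf_univariate(x, delta / nu, nu, delta)` agrees with `f_pdf(x, nu, delta)`, while a scale of 1 gives the density of [math]\displaystyle{ \nu/\delta }[/math] times an F-distributed variable, i.e. `(delta / nu) * f_pdf(x * delta / nu, nu, delta)`.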

References

  1. Olkin, Ingram; Rubin, Herman (1964-03-01). "Multivariate Beta Distributions and Independence Properties of the Wishart Distribution". The Annals of Mathematical Statistics 35 (1): 261–269. doi:10.1214/aoms/1177703748. ISSN 0003-4851. http://projecteuclid.org/euclid.aoms/1177703748.
  2. Dawid, A. P. (1981). "Some matrix-variate distribution theory: Notational considerations and a Bayesian application". Biometrika 68 (1): 265–274. doi:10.1093/biomet/68.1.265. ISSN 0006-3444. https://academic.oup.com/biomet/article-lookup/doi/10.1093/biomet/68.1.265.
  3. Mulder, Joris; Pericchi, Luis Raúl (2018-12-01). "The Matrix-F Prior for Estimating and Testing Covariance Matrices". Bayesian Analysis 13 (4). doi:10.1214/17-BA1092. ISSN 1936-0975.
  4. Williams, Donald R.; Mulder, Joris (2020-12-01). "Bayesian hypothesis testing for Gaussian graphical models: Conditional independence and order constraints". Journal of Mathematical Psychology 99: 102441. doi:10.1016/j.jmp.2020.102441.
  5. Tan, W. Y. (1969-03-01). "Note on the Multivariate and the Generalized Multivariate Beta Distributions". Journal of the American Statistical Association 64 (325): 230–241. doi:10.1080/01621459.1969.10500966. ISSN 0162-1459. http://www.tandfonline.com/doi/abs/10.1080/01621459.1969.10500966.
  6. Pérez, María-Eglée; Pericchi, Luis Raúl; Ramírez, Isabel Cristina (2017-09-01). "The Scaled Beta2 Distribution as a Robust Prior for Scales". Bayesian Analysis 12 (3). doi:10.1214/16-BA1015. ISSN 1936-0975.
  7. Gelman, Andrew (2006-09-01). "Prior distributions for variance parameters in hierarchical models (comment on article by Browne and Draper)". Bayesian Analysis 1 (3). doi:10.1214/06-BA117A. ISSN 1936-0975.