Normal-Wishart distribution

Normal-Wishart
Notation: [math]\displaystyle{ (\boldsymbol\mu,\boldsymbol\Lambda) \sim \mathrm{NW}(\boldsymbol\mu_0,\lambda,\mathbf{W},\nu) }[/math]
Parameters: [math]\displaystyle{ \boldsymbol\mu_0\in\mathbb{R}^D\, }[/math] location (real vector)
[math]\displaystyle{ \lambda \gt 0\, }[/math] scale (real)
[math]\displaystyle{ \mathbf{W} \in\mathbb{R}^{D\times D} }[/math] scale matrix (pos. def.)
[math]\displaystyle{ \nu \gt D-1\, }[/math] degrees of freedom (real)
Support: [math]\displaystyle{ \boldsymbol\mu\in\mathbb{R}^D ;\ \boldsymbol\Lambda \in\mathbb{R}^{D\times D} }[/math] precision matrix (pos. def.)
PDF: [math]\displaystyle{ f(\boldsymbol\mu,\boldsymbol\Lambda|\boldsymbol\mu_0,\lambda,\mathbf{W},\nu) = \mathcal{N}(\boldsymbol\mu|\boldsymbol\mu_0,(\lambda\boldsymbol\Lambda)^{-1})\ \mathcal{W}(\boldsymbol\Lambda|\mathbf{W},\nu) }[/math]

In probability theory and statistics, the normal-Wishart distribution (or Gaussian-Wishart distribution) is a multivariate four-parameter family of continuous probability distributions. It is the conjugate prior of a multivariate normal distribution with unknown mean and precision matrix (the inverse of the covariance matrix).[1]

Definition

Suppose

[math]\displaystyle{ \boldsymbol\mu|\boldsymbol\mu_0,\lambda,\boldsymbol\Lambda \sim \mathcal{N}(\boldsymbol\mu_0,(\lambda\boldsymbol\Lambda)^{-1}) }[/math]

has a multivariate normal distribution with mean [math]\displaystyle{ \boldsymbol\mu_0 }[/math] and covariance matrix [math]\displaystyle{ (\lambda\boldsymbol\Lambda)^{-1} }[/math], where

[math]\displaystyle{ \boldsymbol\Lambda|\mathbf{W},\nu \sim \mathcal{W}(\mathbf{W},\nu) }[/math]

has a Wishart distribution. Then [math]\displaystyle{ (\boldsymbol\mu,\boldsymbol\Lambda) }[/math] has a normal-Wishart distribution, denoted as

[math]\displaystyle{ (\boldsymbol\mu,\boldsymbol\Lambda) \sim \mathrm{NW}(\boldsymbol\mu_0,\lambda,\mathbf{W},\nu) . }[/math]

Characterization

Probability density function

[math]\displaystyle{ f(\boldsymbol\mu,\boldsymbol\Lambda|\boldsymbol\mu_0,\lambda,\mathbf{W},\nu) = \mathcal{N}(\boldsymbol\mu|\boldsymbol\mu_0,(\lambda\boldsymbol\Lambda)^{-1})\ \mathcal{W}(\boldsymbol\Lambda|\mathbf{W},\nu) }[/math]
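Since the density factors into a multivariate normal term and a Wishart term, it can be evaluated directly with standard library routines. A minimal sketch using SciPy (the function name `nw_logpdf` is illustrative, not from the source):

```python
import numpy as np
from scipy.stats import wishart, multivariate_normal

def nw_logpdf(mu, Lam, mu0, lam, W, nu):
    """log f(mu, Lambda) = log N(mu | mu0, (lam*Lambda)^{-1}) + log W(Lambda | W, nu)."""
    # Conditional covariance of mu given the precision matrix Lambda.
    cov = np.linalg.inv(lam * np.asarray(Lam))
    return (multivariate_normal.logpdf(mu, mean=mu0, cov=cov)
            + wishart.logpdf(Lam, df=nu, scale=W))
```

Working in log space avoids underflow, since the Wishart density can be extremely small for high-dimensional precision matrices.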

Properties

Marginal distributions

By construction, the marginal distribution over [math]\displaystyle{ \boldsymbol\Lambda }[/math] is a Wishart distribution, and the conditional distribution over [math]\displaystyle{ \boldsymbol\mu }[/math] given [math]\displaystyle{ \boldsymbol\Lambda }[/math] is a multivariate normal distribution. The marginal distribution over [math]\displaystyle{ \boldsymbol\mu }[/math] is a multivariate t-distribution with [math]\displaystyle{ \nu - D + 1 }[/math] degrees of freedom, location [math]\displaystyle{ \boldsymbol\mu_0 }[/math], and scale matrix [math]\displaystyle{ (\lambda(\nu-D+1)\mathbf{W})^{-1} }[/math].

Posterior distribution of the parameters

After making [math]\displaystyle{ n }[/math] observations [math]\displaystyle{ \boldsymbol{x}_1, \dots, \boldsymbol{x}_n }[/math], the posterior distribution of the parameters is

[math]\displaystyle{ (\boldsymbol\mu,\boldsymbol\Lambda) \sim \mathrm{NW}(\boldsymbol\mu_n,\lambda_n,\mathbf{W}_n,\nu_n), }[/math]

where

[math]\displaystyle{ \lambda_n = \lambda + n, }[/math]
[math]\displaystyle{ \boldsymbol\mu_n = \frac{\lambda \boldsymbol\mu_0 + n\boldsymbol{\bar{x}}}{\lambda + n}, }[/math]
[math]\displaystyle{ \nu_n = \nu + n, }[/math]
[math]\displaystyle{ \mathbf{W}_n^{-1} = \mathbf{W}^{-1} + \sum_{i=1}^n (\boldsymbol{x}_i - \boldsymbol{\bar{x}})(\boldsymbol{x}_i - \boldsymbol{\bar{x}})^T + \frac{n \lambda}{n + \lambda} (\boldsymbol{\bar{x}} - \boldsymbol\mu_0)(\boldsymbol{\bar{x}} - \boldsymbol\mu_0)^T. }[/math][2]
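The update equations above translate directly into code. A minimal sketch in NumPy (the function name `nw_posterior` is illustrative, not from the source):

```python
import numpy as np

def nw_posterior(mu0, lam, W, nu, X):
    """Update a NW(mu0, lam, W, nu) prior with data X of shape (n, D)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    xbar = X.mean(axis=0)
    # Scatter matrix about the sample mean: sum_i (x_i - xbar)(x_i - xbar)^T.
    S = (X - xbar).T @ (X - xbar)
    lam_n = lam + n
    nu_n = nu + n
    mu_n = (lam * mu0 + n * xbar) / lam_n
    # The update is stated for W_n^{-1}; invert at the end to return W_n.
    W_n_inv = (np.linalg.inv(W) + S
               + (n * lam / lam_n) * np.outer(xbar - mu0, xbar - mu0))
    return mu_n, lam_n, np.linalg.inv(W_n_inv), nu_n
```

Note that the posterior mean [math]\displaystyle{ \boldsymbol\mu_n }[/math] is a convex combination of the prior mean and the sample mean, weighted by [math]\displaystyle{ \lambda }[/math] and [math]\displaystyle{ n }[/math] respectively.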

Generating normal-Wishart random variates

Generation of random variates is straightforward:

  1. Sample [math]\displaystyle{ \boldsymbol\Lambda }[/math] from a Wishart distribution with parameters [math]\displaystyle{ \mathbf{W} }[/math] and [math]\displaystyle{ \nu }[/math].
  2. Sample [math]\displaystyle{ \boldsymbol\mu }[/math] from a multivariate normal distribution with mean [math]\displaystyle{ \boldsymbol\mu_0 }[/math] and covariance matrix [math]\displaystyle{ (\lambda\boldsymbol\Lambda)^{-1} }[/math].
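The two steps above can be sketched with SciPy's `wishart` and `multivariate_normal` samplers (the function name `sample_normal_wishart` is illustrative, not from the source):

```python
import numpy as np
from scipy.stats import wishart, multivariate_normal

def sample_normal_wishart(mu0, lam, W, nu, rng=None):
    """Draw one (mu, Lambda) pair from NW(mu0, lam, W, nu)."""
    rng = np.random.default_rng(rng)
    # Step 1: sample the precision matrix from the Wishart distribution.
    Lam = wishart.rvs(df=nu, scale=W, random_state=rng)
    # Step 2: sample the mean from N(mu0, (lam * Lambda)^{-1}).
    cov = np.linalg.inv(lam * Lam)
    mu = multivariate_normal.rvs(mean=mu0, cov=cov, random_state=rng)
    return mu, Lam
```

Because the normal draw conditions on the Wishart draw, repeating this procedure yields exact joint samples, with no approximation involved.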

Notes

  1. Bishop, Christopher M. (2006). Pattern Recognition and Machine Learning. Springer Science+Business Media. p. 690.
  2. Cross Validated, https://stats.stackexchange.com/q/324925

References

  • Bishop, Christopher M. (2006). Pattern Recognition and Machine Learning. Springer Science+Business Media.