Normal-inverse-gamma distribution

From HandWiki
Short description: Family of multivariate continuous probability distributions
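Figure: probability density function of the normal-inverse-gamma distribution for $\alpha = 1.0$, $2.0$ and $4.0$, plotted in shifted and scaled coordinates.
Parameters: $\mu$ location (real); $\lambda > 0$ (real); $\alpha > 0$ (real); $\beta > 0$ (real)
Support: $x \in (-\infty, \infty)$, $\sigma^2 \in (0, \infty)$
PDF: $\frac{\sqrt{\lambda}}{\sqrt{2\pi\sigma^2}} \frac{\beta^\alpha}{\Gamma(\alpha)} \left(\frac{1}{\sigma^2}\right)^{\alpha+1} \exp\left(-\frac{2\beta + \lambda(x-\mu)^2}{2\sigma^2}\right)$
Mean: $\operatorname{E}[x] = \mu$; $\operatorname{E}[\sigma^2] = \frac{\beta}{\alpha - 1}$, for $\alpha > 1$
Mode: $x = \mu$ (univariate), $\mathbf{x} = \boldsymbol{\mu}$ (multivariate); $\sigma^2 = \frac{\beta}{\alpha + 1 + 1/2}$ (univariate), $\sigma^2 = \frac{\beta}{\alpha + 1 + k/2}$ (multivariate)
Variance: $\operatorname{Var}[x] = \frac{\beta}{(\alpha - 1)\lambda}$, for $\alpha > 1$; $\operatorname{Var}[\sigma^2] = \frac{\beta^2}{(\alpha - 1)^2(\alpha - 2)}$, for $\alpha > 2$; $\operatorname{Cov}[x, \sigma^2] = 0$, for $\alpha > 1$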

In probability theory and statistics, the normal-inverse-gamma distribution (or Gaussian-inverse-gamma distribution) is a four-parameter family of multivariate continuous probability distributions. It is the conjugate prior of a normal distribution with unknown mean and variance.

Definition

Suppose

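$$x \mid \sigma^2, \mu, \lambda \sim \mathrm{N}(\mu, \sigma^2/\lambda)$$

has a normal distribution with mean $\mu$ and variance $\sigma^2/\lambda$, where

$$\sigma^2 \mid \alpha, \beta \sim \Gamma^{-1}(\alpha, \beta)$$

has an inverse-gamma distribution. Then $(x, \sigma^2)$ has a normal-inverse-gamma distribution, denoted as

$$(x, \sigma^2) \sim \text{N-}\Gamma^{-1}(\mu, \lambda, \alpha, \beta).$$

($\text{NIG}$ is also used instead of $\text{N-}\Gamma^{-1}$.)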

The normal-inverse-Wishart distribution is a generalization of the normal-inverse-gamma distribution that is defined over multivariate random variables.

Characterization

Probability density function

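$$f(x, \sigma^2 \mid \mu, \lambda, \alpha, \beta) = \frac{\sqrt{\lambda}}{\sigma\sqrt{2\pi}} \, \frac{\beta^\alpha}{\Gamma(\alpha)} \left(\frac{1}{\sigma^2}\right)^{\alpha+1} \exp\left(-\frac{2\beta + \lambda(x-\mu)^2}{2\sigma^2}\right)$$

For the multivariate form where $\mathbf{x}$ is a $k \times 1$ random vector,

$$f(\mathbf{x}, \sigma^2 \mid \boldsymbol{\mu}, \mathbf{V}^{-1}, \alpha, \beta) = |\mathbf{V}|^{-1/2} (2\pi)^{-k/2} \, \frac{\beta^\alpha}{\Gamma(\alpha)} \left(\frac{1}{\sigma^2}\right)^{\alpha+1+k/2} \exp\left(-\frac{2\beta + (\mathbf{x}-\boldsymbol{\mu})^T \mathbf{V}^{-1} (\mathbf{x}-\boldsymbol{\mu})}{2\sigma^2}\right),$$

where $|\mathbf{V}|$ is the determinant of the $k \times k$ matrix $\mathbf{V}$. Note how this last equation reduces to the first form if $k = 1$, so that $\mathbf{x}$, $\mathbf{V}$ and $\boldsymbol{\mu}$ are scalars and $\mathbf{V}^{-1} = \lambda$.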
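As an illustration, the univariate density above can be evaluated on the log scale, which avoids overflow in $\beta^\alpha / \Gamma(\alpha)$ for large $\alpha$. The following is a minimal sketch using only the Python standard library; the function names `nig_logpdf` and `nig_pdf` and the argument name `lam` (for $\lambda$) are choices made for this example, not part of any library.

```python
import math


def nig_logpdf(x, sigma2, mu, lam, alpha, beta):
    """Log-density of the univariate normal-inverse-gamma distribution at (x, sigma2)."""
    if sigma2 <= 0 or lam <= 0 or alpha <= 0 or beta <= 0:
        raise ValueError("sigma2, lam, alpha and beta must all be positive")
    return (
        # log of sqrt(lam) / (sigma * sqrt(2*pi))
        0.5 * math.log(lam) - 0.5 * math.log(2.0 * math.pi * sigma2)
        # log of beta^alpha / Gamma(alpha)
        + alpha * math.log(beta) - math.lgamma(alpha)
        # log of (1 / sigma^2)^(alpha + 1)
        - (alpha + 1.0) * math.log(sigma2)
        # exponent of the density
        - (2.0 * beta + lam * (x - mu) ** 2) / (2.0 * sigma2)
    )


def nig_pdf(x, sigma2, mu, lam, alpha, beta):
    """Density of the univariate normal-inverse-gamma distribution at (x, sigma2)."""
    return math.exp(nig_logpdf(x, sigma2, mu, lam, alpha, beta))
```

For $\mu = 0$ and $\lambda = \alpha = \beta = 1$, the density at $(x, \sigma^2) = (0, 1)$ is $e^{-1}/\sqrt{2\pi}$, about 0.1468, which `nig_pdf(0.0, 1.0, 0.0, 1.0, 1.0, 1.0)` reproduces.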

Alternative parameterization

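It is also possible to let $\gamma = 1/\lambda$, in which case the PDF becomes

$$f(x, \sigma^2 \mid \mu, \gamma, \alpha, \beta) = \frac{1}{\sigma\sqrt{2\pi\gamma}} \, \frac{\beta^\alpha}{\Gamma(\alpha)} \left(\frac{1}{\sigma^2}\right)^{\alpha+1} \exp\left(-\frac{2\gamma\beta + (x-\mu)^2}{2\gamma\sigma^2}\right)$$

In the multivariate form, the corresponding change would be to regard the covariance matrix $\mathbf{V}$ instead of its inverse $\mathbf{V}^{-1}$ as a parameter.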

Cumulative distribution function

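$$F(x, \sigma^2 \mid \mu, \lambda, \alpha, \beta) = \frac{e^{-\frac{\beta}{\sigma^2}} \left(\frac{\beta}{\sigma^2}\right)^{\alpha} \left(\operatorname{erf}\left(\frac{\sqrt{\lambda}(x-\mu)}{\sqrt{2}\sigma}\right) + 1\right)}{2\sigma^2 \Gamma(\alpha)}$$

Note that this expression is the joint density integrated over the first argument only: it is the inverse-gamma density of $\sigma^2$ multiplied by the conditional normal distribution function of $x$, so it remains a density in $\sigma^2$.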

Properties

Marginal distributions

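Given $(x, \sigma^2) \sim \text{N-}\Gamma^{-1}(\mu, \lambda, \alpha, \beta)$ as above, $\sigma^2$ by itself follows an inverse-gamma distribution:

$$\sigma^2 \sim \Gamma^{-1}(\alpha, \beta)$$

while $\sqrt{\frac{\alpha\lambda}{\beta}}(x - \mu)$ follows a t distribution with $2\alpha$ degrees of freedom.[1]

In the multivariate case, the marginal distribution of $\mathbf{x}$ is a multivariate t distribution:

$$\mathbf{x} \sim t_{2\alpha}\left(\boldsymbol{\mu}, \frac{\beta}{\alpha}\mathbf{V}\right)$$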
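The univariate marginal of $x$ can be written out explicitly: since $\sqrt{\alpha\lambda/\beta}\,(x - \mu)$ is t distributed with $2\alpha$ degrees of freedom, $x$ has a location-scale t density with location $\mu$ and squared scale $\beta/(\alpha\lambda)$. The sketch below evaluates that density with the Python standard library; the function name `marginal_x_pdf` and the argument name `lam` are assumptions made for this example.

```python
import math


def marginal_x_pdf(x, mu, lam, alpha, beta):
    """Marginal density of x: Student t with 2*alpha degrees of freedom,
    location mu and squared scale beta / (alpha * lam)."""
    nu = 2.0 * alpha
    scale = math.sqrt(beta / (alpha * lam))
    z = (x - mu) / scale
    # Log of the t normalizing constant, divided by the scale (change of variables).
    log_norm = (
        math.lgamma((nu + 1.0) / 2.0)
        - math.lgamma(nu / 2.0)
        - 0.5 * math.log(nu * math.pi)
        - math.log(scale)
    )
    return math.exp(log_norm - 0.5 * (nu + 1.0) * math.log1p(z * z / nu))
```

A quick sanity check is the case $\alpha = 1/2$, where the t distribution has one degree of freedom and reduces to a Cauchy distribution: with $\lambda = 2$ and $\beta = 1$ the scale is 1, so the density at $x = \mu$ is $1/\pi$.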

Summation

Scaling

Suppose

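$$(x, \sigma^2) \sim \text{N-}\Gamma^{-1}(\mu, \lambda, \alpha, \beta).$$

Then for $c > 0$,

$$(cx, c\sigma^2) \sim \text{N-}\Gamma^{-1}(c\mu, \lambda/c, \alpha, c\beta).$$

Proof: Let $(x, \sigma^2) \sim \text{N-}\Gamma^{-1}(\mu, \lambda, \alpha, \beta)$ and fix $c > 0$. Defining $Y = (Y_1, Y_2) = (cx, c\sigma^2)$, observe that the PDF of the random variable $Y$ evaluated at $(y_1, y_2)$ is given by $1/c^2$ times the PDF of a $\text{N-}\Gamma^{-1}(\mu, \lambda, \alpha, \beta)$ random variable evaluated at $(y_1/c, y_2/c)$, where $1/c^2$ is the Jacobian factor of the inverse map. Hence the PDF of $Y$ evaluated at $(y_1, y_2)$ is given by

$$f_Y(y_1, y_2) = \frac{1}{c^2} \frac{\sqrt{\lambda}}{\sqrt{2\pi y_2/c}} \, \frac{\beta^\alpha}{\Gamma(\alpha)} \left(\frac{1}{y_2/c}\right)^{\alpha+1} \exp\left(-\frac{2\beta + \lambda(y_1/c - \mu)^2}{2y_2/c}\right) = \frac{\sqrt{\lambda/c}}{\sqrt{2\pi y_2}} \, \frac{(c\beta)^\alpha}{\Gamma(\alpha)} \left(\frac{1}{y_2}\right)^{\alpha+1} \exp\left(-\frac{2c\beta + (\lambda/c)(y_1 - c\mu)^2}{2y_2}\right).$$

The right-hand expression is the PDF of a $\text{N-}\Gamma^{-1}(c\mu, \lambda/c, \alpha, c\beta)$ random variable evaluated at $(y_1, y_2)$, which completes the proof.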
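The identity in the proof can also be checked numerically at any point. The sketch below, which assumes nothing beyond the univariate density given earlier, evaluates both sides of the equation for $f_Y$; the helper names `nig_pdf`, `pdf_by_change_of_variables` and `pdf_by_scaled_parameters` are made up for this example.

```python
import math


def nig_pdf(x, sigma2, mu, lam, alpha, beta):
    """Univariate normal-inverse-gamma density at (x, sigma2)."""
    log_f = (
        0.5 * math.log(lam) - 0.5 * math.log(2.0 * math.pi * sigma2)
        + alpha * math.log(beta) - math.lgamma(alpha)
        - (alpha + 1.0) * math.log(sigma2)
        - (2.0 * beta + lam * (x - mu) ** 2) / (2.0 * sigma2)
    )
    return math.exp(log_f)


def pdf_by_change_of_variables(y1, y2, c, mu, lam, alpha, beta):
    """Density of (c*x, c*sigma2): original density at (y1/c, y2/c) times 1/c^2."""
    return nig_pdf(y1 / c, y2 / c, mu, lam, alpha, beta) / c ** 2


def pdf_by_scaled_parameters(y1, y2, c, mu, lam, alpha, beta):
    """Density of N-Gamma^{-1}(c*mu, lam/c, alpha, c*beta) at (y1, y2)."""
    return nig_pdf(y1, y2, c * mu, lam / c, alpha, c * beta)
```

The two functions should agree up to floating-point rounding for any admissible arguments, for instance `c = 3.0`, `mu = 0.5`, `lam = 2.0`, `alpha = 1.5`, `beta = 0.7` at the point `(1.2, 2.5)`.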

Exponential family

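Normal-inverse-gamma distributions form an exponential family with natural parameters $\theta_1 = -\frac{\lambda}{2}$, $\theta_2 = \lambda\mu$, $\theta_3 = \alpha$, and $\theta_4 = -\beta - \frac{\lambda\mu^2}{2}$ and sufficient statistics $T_1 = \frac{x^2}{\sigma^2}$, $T_2 = \frac{x}{\sigma^2}$, $T_3 = \log\left(\frac{1}{\sigma^2}\right)$, and $T_4 = \frac{1}{\sigma^2}$.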
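To see how these parameters reproduce the density, one can write $\log f = \sum_i \theta_i T_i + \log h(x, \sigma^2) - A$. The base measure $h(x, \sigma^2) = (2\pi)^{-1/2} (\sigma^2)^{-3/2}$ and log-partition term $A = \log\Gamma(\alpha) - \alpha\log\beta - \tfrac{1}{2}\log\lambda$ used below are not stated in the text; they are derived here from the univariate PDF under the assumption that $\theta_3 = \alpha$, and the function names are hypothetical. The sketch compares this decomposition with the direct log-density.

```python
import math


def natural_parameters(mu, lam, alpha, beta):
    """Natural parameters (theta_1, ..., theta_4)."""
    return (-lam / 2.0, lam * mu, alpha, -beta - lam * mu ** 2 / 2.0)


def sufficient_statistics(x, sigma2):
    """Sufficient statistics (T_1, ..., T_4)."""
    return (x ** 2 / sigma2, x / sigma2, math.log(1.0 / sigma2), 1.0 / sigma2)


def logpdf_exponential_family(x, sigma2, mu, lam, alpha, beta):
    """Log-density assembled as theta . T + log h - A (h and A derived for this sketch)."""
    theta = natural_parameters(mu, lam, alpha, beta)
    stats = sufficient_statistics(x, sigma2)
    log_h = -0.5 * math.log(2.0 * math.pi) - 1.5 * math.log(sigma2)
    log_partition = math.lgamma(alpha) - alpha * math.log(beta) - 0.5 * math.log(lam)
    return sum(th * t for th, t in zip(theta, stats)) + log_h - log_partition


def logpdf_direct(x, sigma2, mu, lam, alpha, beta):
    """Log of the univariate PDF, written term by term."""
    return (
        0.5 * math.log(lam) - 0.5 * math.log(2.0 * math.pi * sigma2)
        + alpha * math.log(beta) - math.lgamma(alpha)
        - (alpha + 1.0) * math.log(sigma2)
        - (2.0 * beta + lam * (x - mu) ** 2) / (2.0 * sigma2)
    )
```

Agreement between the two functions confirms the signs of $\theta_1$ and $\theta_4$: expanding $-\frac{2\beta + \lambda(x-\mu)^2}{2\sigma^2}$ gives the coefficients $-\lambda/2$ on $x^2/\sigma^2$, $\lambda\mu$ on $x/\sigma^2$ and $-\beta - \lambda\mu^2/2$ on $1/\sigma^2$.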

Information entropy

Kullback–Leibler divergence

The Kullback–Leibler divergence measures the difference between two distributions.

Maximum likelihood estimation

Posterior distribution of the parameters

See the articles on normal-gamma distribution and conjugate prior.
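For orientation, the standard conjugate update can be sketched in a few lines. The sketch assumes the usual model, which this article does not spell out: observations $x_1, \dots, x_n$ are independent $\mathrm{N}(m, \sigma^2)$ given $(m, \sigma^2)$, and the prior is $(m, \sigma^2) \sim \text{N-}\Gamma^{-1}(\mu_0, \lambda_0, \alpha_0, \beta_0)$. The function name `nig_posterior` and its argument names are made up for this example.

```python
def nig_posterior(mu0, lam0, alpha0, beta0, data):
    """Posterior (mu_n, lam_n, alpha_n, beta_n) for a normal likelihood with
    unknown mean and variance under a normal-inverse-gamma prior (assumed model)."""
    n = len(data)
    if n == 0:
        return mu0, lam0, alpha0, beta0  # no data: posterior equals prior
    xbar = sum(data) / n
    sum_sq_dev = sum((xi - xbar) ** 2 for xi in data)
    lam_n = lam0 + n
    mu_n = (lam0 * mu0 + n * xbar) / lam_n
    alpha_n = alpha0 + n / 2.0
    # Prior beta, plus half the within-sample scatter, plus a term that
    # penalizes disagreement between the prior mean and the sample mean.
    beta_n = beta0 + 0.5 * sum_sq_dev + lam0 * n * (xbar - mu0) ** 2 / (2.0 * lam_n)
    return mu_n, lam_n, alpha_n, beta_n
```

With prior parameters $(0, 1, 1, 1)$ and data $1, 2, 3$, this gives $\mu_n = 1.5$, $\lambda_n = 4$, $\alpha_n = 2.5$ and $\beta_n = 3.5$.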

Interpretation of the parameters

See the articles on normal-gamma distribution and conjugate prior.

Generating normal-inverse-gamma random variates

Generation of random variates is straightforward:

  1. Sample $\sigma^2$ from an inverse-gamma distribution with parameters $\alpha$ and $\beta$
  2. Sample $x$ from a normal distribution with mean $\mu$ and variance $\sigma^2/\lambda$
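The two steps above can be sketched with the Python standard library. The sketch relies on the fact that the reciprocal of a gamma variate with shape $\alpha$ and rate $\beta$ is inverse-gamma distributed with parameters $\alpha$ and $\beta$; because `random.gammavariate` takes a scale rather than a rate, the scale passed is $1/\beta$. The function name `sample_nig` and the optional `rng` argument are choices made for this example.

```python
import math
import random


def sample_nig(mu, lam, alpha, beta, rng=None):
    """Draw one (x, sigma2) pair from N-Gamma^{-1}(mu, lam, alpha, beta)."""
    rng = rng or random
    # Step 1: sigma2 ~ Inverse-Gamma(alpha, beta), as the reciprocal of a
    # Gamma(shape=alpha, scale=1/beta) variate.
    sigma2 = 1.0 / rng.gammavariate(alpha, 1.0 / beta)
    # Step 2: x | sigma2 ~ Normal(mu, sigma2 / lam); gauss expects a standard deviation.
    x = rng.gauss(mu, math.sqrt(sigma2 / lam))
    return x, sigma2
```

Passing a seeded `random.Random` instance as `rng` makes the draws reproducible. Over many draws with $\alpha > 1$, the sample means of $x$ and $\sigma^2$ should approach $\mu$ and $\beta/(\alpha - 1)$ respectively.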

See also

  • The normal-gamma distribution is the same distribution parameterized by precision rather than variance
  • The normal-inverse-Wishart distribution is a generalization of this distribution that allows for a multivariate mean and a completely unknown positive-definite covariance matrix, whereas in the multivariate normal-inverse-gamma distribution the covariance matrix $\sigma^2\mathbf{V}$ is regarded as known up to the scale factor $\sigma^2$

References

  • Denison, David G. T. et al. (2002). Bayesian Methods for Nonlinear Classification and Regression. Wiley. ISBN 0471490369. 
  • Koch, Karl-Rudolf (2007). Introduction to Bayesian Statistics (2nd ed.). Springer. ISBN 354072723X.