Scaled inverse chi-squared distribution

Scaled inverse chi-squared
Parameters	[math]\displaystyle{ \nu \gt 0\, }[/math]; [math]\displaystyle{ \tau^2 \gt 0\, }[/math]
Support	[math]\displaystyle{ x \in (0, \infty) }[/math]
PDF	[math]\displaystyle{ \frac{(\tau^2\nu/2)^{\nu/2}}{\Gamma(\nu/2)}~ \frac{\exp\left[ \frac{-\nu \tau^2}{2 x}\right]}{x^{1+\nu/2}} }[/math]
CDF	[math]\displaystyle{ \Gamma\left(\frac{\nu}{2},\frac{\tau^2\nu}{2x}\right) \left/\Gamma\left(\frac{\nu}{2}\right)\right. }[/math]
Mean	[math]\displaystyle{ \frac{\nu \tau^2}{\nu-2} }[/math] for [math]\displaystyle{ \nu \gt 2\, }[/math]
Mode	[math]\displaystyle{ \frac{\nu \tau^2}{\nu+2} }[/math]
Variance	[math]\displaystyle{ \frac{2 \nu^2 \tau^4}{(\nu-2)^2 (\nu-4)} }[/math] for [math]\displaystyle{ \nu \gt 4\, }[/math]
Skewness	[math]\displaystyle{ \frac{4}{\nu-6}\sqrt{2(\nu-4)} }[/math] for [math]\displaystyle{ \nu \gt 6\, }[/math]
Excess kurtosis	[math]\displaystyle{ \frac{12(5\nu-22)}{(\nu-6)(\nu-8)} }[/math] for [math]\displaystyle{ \nu \gt 8\, }[/math]
Entropy	[math]\displaystyle{ \frac{\nu}{2} \!+\!\ln\left(\frac{\tau^2\nu}{2}\Gamma\left(\frac{\nu}{2}\right)\right) }[/math] [math]\displaystyle{ \!-\!\left(1\!+\!\frac{\nu}{2}\right)\psi\left(\frac{\nu}{2}\right) }[/math]
MGF	[math]\displaystyle{ \frac{2}{\Gamma(\frac{\nu}{2})}\left(\frac{-\tau^2\nu t}{2}\right)^{\!\!\frac{\nu}{4}}\!\!K_{\frac{\nu}{2}}\left(\sqrt{-2\tau^2\nu t}\right) }[/math]
CF	[math]\displaystyle{ \frac{2}{\Gamma(\frac{\nu}{2})}\left(\frac{-i\tau^2\nu t}{2}\right)^{\!\!\frac{\nu}{4}}\!\!K_{\frac{\nu}{2}}\left(\sqrt{-2i\tau^2\nu t}\right) }[/math]


The scaled inverse chi-squared distribution is the distribution for x = 1/s², where s² is a sample mean of the squares of ν independent normal random variables that have mean 0 and inverse variance 1/σ² = τ². The distribution is therefore parametrised by the two quantities ν and τ², referred to as the number of chi-squared degrees of freedom and the scaling parameter, respectively.
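The definition can be illustrated by simulation. The following is a minimal sketch using only the Python standard library; the function name `simulate_inverse_mean_square` and the parameter values are illustrative assumptions, not part of any reference implementation. It draws ν zero-mean normal deviates with variance σ² = 1/τ², forms their mean square s², and records x = 1/s². For ν > 2 the average of the simulated x values should land near the mean ντ²/(ν − 2) listed in the summary table.

```python
import random

def simulate_inverse_mean_square(nu, tau2, n_draws, seed=0):
    """Draw x = 1/s^2, where s^2 is the mean of nu squared N(0, 1/tau2) deviates."""
    rng = random.Random(seed)
    sigma = (1.0 / tau2) ** 0.5  # standard deviation of each normal deviate
    draws = []
    for _ in range(n_draws):
        s2 = sum(rng.gauss(0.0, sigma) ** 2 for _ in range(nu)) / nu
        draws.append(1.0 / s2)
    return draws

nu, tau2 = 10, 2.0  # illustrative values only
xs = simulate_inverse_mean_square(nu, tau2, n_draws=20_000)
sample_mean = sum(xs) / len(xs)
theoretical_mean = nu * tau2 / (nu - 2)  # valid for nu > 2
print(sample_mean, theoretical_mean)
```

The agreement is only statistical: the two printed numbers should be close, with a discrepancy that shrinks as `n_draws` grows.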

This family of scaled inverse chi-squared distributions is closely related to two other distribution families, those of the inverse-chi-squared distribution and the inverse-gamma distribution. Compared to the inverse-chi-squared distribution, the scaled distribution has an extra parameter τ², which scales the distribution horizontally and vertically, representing the inverse-variance of the original underlying process. Also, the scaled inverse chi-squared distribution is presented as the distribution for the inverse of the mean of ν squared deviates, rather than the inverse of their sum. The two distributions thus have the relation that if

[math]\displaystyle{ X \sim \mbox{Scale-inv-}\chi^2(\nu, \tau^2) }[/math] then [math]\displaystyle{ \frac{X}{\tau^2 \nu} \sim \mbox{inv-}\chi^2(\nu) }[/math]

Compared to the inverse gamma distribution, the scaled inverse chi-squared distribution describes the same data distribution, but using a different parametrization, which may be more convenient in some circumstances. Specifically, if

[math]\displaystyle{ X \sim \mbox{Scale-inv-}\chi^2(\nu, \tau^2) }[/math] then [math]\displaystyle{ X \sim \textrm{Inv-Gamma}\left(\frac{\nu}{2}, \frac{\nu\tau^2}{2}\right) }[/math]
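The inverse-gamma form suggests a simple way to generate draws. The sketch below assumes only the Python standard library, whose `random.gammavariate(alpha, beta)` takes a shape and a scale; the helper names `to_inverse_gamma` and `sample_scaled_inv_chi2` are hypothetical. It uses the fact that if G follows a gamma distribution with shape α and unit scale, then β/G follows an inverse-gamma distribution with shape α and scale β. It also checks the rescaling relation to the inverse-chi-squared distribution, whose mean is 1/(ν − 2) for ν > 2.

```python
import random

def to_inverse_gamma(nu, tau2):
    """Map Scale-inv-chi2(nu, tau2) to inverse-gamma (shape, scale) parameters."""
    return nu / 2.0, nu * tau2 / 2.0

def sample_scaled_inv_chi2(nu, tau2, rng):
    """One draw: if G ~ Gamma(shape, 1), then scale / G ~ Inv-Gamma(shape, scale)."""
    shape, scale = to_inverse_gamma(nu, tau2)
    return scale / rng.gammavariate(shape, 1.0)

rng = random.Random(42)
nu, tau2 = 12.0, 0.5  # illustrative values only
draws = [sample_scaled_inv_chi2(nu, tau2, rng) for _ in range(20_000)]

# Dividing by tau2 * nu should give inverse-chi-squared(nu) draws,
# whose mean is 1 / (nu - 2) when nu > 2.
rescaled_mean = sum(x / (tau2 * nu) for x in draws) / len(draws)
print(rescaled_mean, 1.0 / (nu - 2.0))
```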

Either form may be used to represent the maximum entropy distribution for a fixed first inverse moment [math]\displaystyle{ (E(1/X)) }[/math] and first logarithmic moment [math]\displaystyle{ (E(\ln(X))) }[/math].

The scaled inverse chi-squared distribution also has a particular use in Bayesian statistics, somewhat unrelated to its use as a predictive distribution for x = 1/s². Specifically, the scaled inverse chi-squared distribution can be used as a conjugate prior for the variance parameter of a normal distribution. In this context the scaling parameter is denoted by σ₀² rather than by τ², and has a different interpretation. The application is more commonly presented using the inverse-gamma formulation; however, some authors, following in particular Gelman et al. (1995/2004), argue that the inverse chi-squared parametrisation is more intuitive.

Characterization

The probability density function of the scaled inverse chi-squared distribution extends over the domain [math]\displaystyle{ x\gt 0 }[/math] and is

[math]\displaystyle{ f(x; \nu, \tau^2)= \frac{(\tau^2\nu/2)^{\nu/2}}{\Gamma(\nu/2)}~ \frac{\exp\left[ \frac{-\nu \tau^2}{2 x}\right]}{x^{1+\nu/2}} }[/math]

where [math]\displaystyle{ \nu }[/math] is the degrees of freedom parameter and [math]\displaystyle{ \tau^2 }[/math] is the scale parameter. The cumulative distribution function is

[math]\displaystyle{ F(x; \nu, \tau^2)= \Gamma\left(\frac{\nu}{2},\frac{\tau^2\nu}{2x}\right) \left/\Gamma\left(\frac{\nu}{2}\right)\right. =Q\left(\frac{\nu}{2},\frac{\tau^2\nu}{2x}\right) }[/math]

where [math]\displaystyle{ \Gamma(a,x) }[/math] is the upper incomplete gamma function, [math]\displaystyle{ \Gamma(x) }[/math] is the gamma function and [math]\displaystyle{ Q(a,x) }[/math] is the regularized upper incomplete gamma function. The characteristic function is

[math]\displaystyle{ \varphi(t;\nu,\tau^2)= \frac{2}{\Gamma(\frac{\nu}{2})}\left(\frac{-i\tau^2\nu t}{2}\right)^{\!\!\frac{\nu}{4}}\!\!K_{\frac{\nu}{2}}\left(\sqrt{-2i\tau^2\nu t}\right) , }[/math]

where [math]\displaystyle{ K_{\frac{\nu}{2}}(z) }[/math] is the modified Bessel function of the second kind.
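The density is straightforward to evaluate numerically. The following is a minimal sketch (function names are illustrative assumptions) that works in log space with `math.lgamma` to avoid overflow for large ν. It then checks two properties stated in this article: that the density integrates to approximately one, here using a crude midpoint rule on a truncated range, and that it peaks at the mode ντ²/(ν + 2).

```python
import math

def scaled_inv_chi2_logpdf(x, nu, tau2):
    """Log-density of Scale-inv-chi2(nu, tau2) at x; -inf outside the support."""
    if x <= 0:
        return -math.inf
    half_nu = nu / 2.0
    return (half_nu * math.log(tau2 * half_nu)   # log of (tau2 * nu / 2)^(nu / 2)
            - math.lgamma(half_nu)               # minus log Gamma(nu / 2)
            - nu * tau2 / (2.0 * x)              # exponent term
            - (1.0 + half_nu) * math.log(x))     # minus log of x^(1 + nu / 2)

def scaled_inv_chi2_pdf(x, nu, tau2):
    return math.exp(scaled_inv_chi2_logpdf(x, nu, tau2))

nu, tau2 = 5.0, 1.5  # illustrative values only
mode = nu * tau2 / (nu + 2.0)

# Midpoint rule on (0, 200); the tail beyond 200 is negligible for these values.
step, upper = 0.001, 200.0
n_steps = int(upper / step)
total = step * sum(scaled_inv_chi2_pdf((i + 0.5) * step, nu, tau2)
                   for i in range(n_steps))
print(total, mode)
```

The truncation point and step size are ad hoc choices for this example; heavier-tailed cases (small ν) would need a wider range or a change of variable.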

Parameter estimation

The maximum likelihood estimate of [math]\displaystyle{ \tau^2 }[/math] is

[math]\displaystyle{ \tau^2 = n/\sum_{i=1}^n \frac{1}{x_i}. }[/math]

The maximum likelihood estimate of [math]\displaystyle{ \frac{\nu}{2} }[/math] can be found using Newton's method on:

[math]\displaystyle{ \ln\left(\frac{\nu}{2}\right) - \psi\left(\frac{\nu}{2}\right) = \frac{1}{n} \sum_{i=1}^n \ln\left(x_i\right) - \ln\left(\tau^2\right) , }[/math]

where [math]\displaystyle{ \psi(x) }[/math] is the digamma function. An initial estimate can be found by taking the formula for the mean and solving it for [math]\displaystyle{ \nu. }[/math] Let [math]\displaystyle{ \bar{x} = \frac{1}{n}\sum_{i=1}^n x_i }[/math] be the sample mean. Then an initial estimate for [math]\displaystyle{ \nu }[/math] is given by:

[math]\displaystyle{ \frac{\nu}{2} = \frac{\bar{x}}{\bar{x} - \tau^2}. }[/math]
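The estimation procedure above can be sketched as follows. This is an illustrative implementation under stated assumptions, not a reference one: the Python standard library has no digamma or trigamma function, so both are approximated here by an upward recurrence followed by a standard asymptotic series, and a simple safeguard (halving the iterate) is added in case a Newton step leaves the positive axis. All function names are hypothetical, and the data are assumed positive and not all identical.

```python
import math
import random

def digamma(x):
    """Approximate psi(x) for x > 0: recurrence up to x >= 6, then asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return (result + math.log(x) - 0.5 / x
            - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252)))

def trigamma(x):
    """Approximate psi'(x) for x > 0 in the same way."""
    result = 0.0
    while x < 6.0:
        result += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    return (result + inv + 0.5 * inv2
            + inv * inv2 * (1 / 6 - inv2 * (1 / 30 - inv2 / 42)))

def fit_scaled_inv_chi2(xs, tol=1e-10, max_iter=100):
    """Return (nu_hat, tau2_hat) by maximum likelihood."""
    n = len(xs)
    tau2 = n / sum(1.0 / x for x in xs)            # closed-form estimate of tau^2
    target = sum(math.log(x) for x in xs) / n - math.log(tau2)
    mean = sum(xs) / n
    a = mean / (mean - tau2)                       # initial estimate of nu / 2
    for _ in range(max_iter):
        g = math.log(a) - digamma(a) - target      # equation to solve: g(a) = 0
        g_prime = 1.0 / a - trigamma(a)
        a_new = a - g / g_prime                    # Newton step
        if a_new <= 0.0:                           # safeguard, not part of plain Newton
            a_new = a / 2.0
        converged = abs(a_new - a) <= tol * a
        a = a_new
        if converged:
            break
    return 2.0 * a, tau2

# Synthetic data with known parameters (illustrative values only).
rng = random.Random(1)
true_nu, true_tau2 = 6.0, 2.0
data = [(true_nu * true_tau2 / 2.0) / rng.gammavariate(true_nu / 2.0, 1.0)
        for _ in range(20_000)]
nu_hat, tau2_hat = fit_scaled_inv_chi2(data)
print(nu_hat, tau2_hat)
```

With this many observations the estimates should fall close to the generating values; the exact figures depend on the random seed.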

Bayesian estimation of the variance of a normal distribution

The scaled inverse chi-squared distribution has a second important application, in the Bayesian estimation of the variance of a normal distribution.

According to Bayes' theorem, the posterior probability distribution for quantities of interest is proportional to the product of a prior distribution for the quantities and a likelihood function:

[math]\displaystyle{ p(\sigma^2|D,I) \propto p(\sigma^2|I) \; p(D|\sigma^2) }[/math]

where D represents the data and I represents any initial information about σ² that we may already have.

The simplest scenario arises if the mean μ is already known; or, alternatively, if it is the conditional distribution of σ² that is sought, for a particular assumed value of μ.

Then the likelihood term L(σ²|D) = p(D|σ²) has the familiar form

[math]\displaystyle{ \mathcal{L}(\sigma^2|D,\mu) = \frac{1}{\left(\sqrt{2\pi}\sigma\right)^n} \; \exp \left[ -\frac{\sum_i^n(x_i-\mu)^2}{2\sigma^2} \right] }[/math]

Combining this with the rescaling-invariant prior p(σ²|I) ∝ 1/σ², which can be argued (e.g. following Jeffreys) to be the least informative possible prior for σ² in this problem, gives the posterior probability

[math]\displaystyle{ p(\sigma^2|D, I, \mu) \propto \frac{1}{\sigma^{n+2}} \; \exp \left[ -\frac{\sum_i^n(x_i-\mu)^2}{2\sigma^2} \right] }[/math]

This form can be recognised as that of a scaled inverse chi-squared distribution, with parameters ν = n and τ² = s² = (1/n) Σᵢ (xᵢ − μ)².

Gelman et al. remark that the re-appearance of this distribution, previously seen in a sampling context, may seem remarkable; but given the choice of prior the "result is not surprising".[1]

In particular, the choice of a rescaling-invariant prior for σ² has the result that the probability distribution of the ratio σ²/s² has the same form (independent of the conditioning variable) when conditioned on s² as when conditioned on σ²:

[math]\displaystyle{ p(\tfrac{\sigma^2}{s^2}|s^2) = p(\tfrac{\sigma^2}{s^2}|\sigma^2) }[/math]

In the sampling-theory case, conditioned on σ², the probability distribution for (1/s²) is a scaled inverse chi-squared distribution; and so the probability distribution for σ² conditioned on s², given a scale-agnostic prior, is also a scaled inverse chi-squared distribution.

Use as an informative prior

If more is known about the possible values of σ², a distribution from the scaled inverse chi-squared family, such as Scale-inv-χ²(n₀, s₀²), can be a convenient form to represent a more informative prior for σ², as if from the result of n₀ previous observations (though n₀ need not be a whole number):

[math]\displaystyle{ p(\sigma^2|I^\prime, \mu) \propto \frac{1}{\sigma^{n_0+2}} \; \exp \left[ -\frac{n_0 s_0^2}{2\sigma^2} \right] }[/math]

Such a prior would lead to the posterior distribution

[math]\displaystyle{ p(\sigma^2|D, I^\prime, \mu) \propto \frac{1}{\sigma^{n+n_0+2}} \; \exp \left[ -\frac{ns^2 + n_0 s_0^2}{2\sigma^2} \right] }[/math]

which is itself a scaled inverse chi-squared distribution, with parameters ν = n + n₀ and τ² = (ns² + n₀s₀²)/(n + n₀). The scaled inverse chi-squared distributions are thus a convenient conjugate prior family for σ² estimation.
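The conjugate update amounts to simple bookkeeping on the two parameters. The sketch below is illustrative only: the function name `update_scaled_inv_chi2` and the data values are made up. It matches the exponent and the power of σ in the posterior above against the density of the scaled inverse chi-squared distribution, giving ν = n₀ + n and ντ² = n₀s₀² + ns². Setting n₀ = 0 recovers the result for the rescaling-invariant prior with known mean.

```python
def update_scaled_inv_chi2(n0, s0_sq, data, mu):
    """Posterior (nu, tau2) for sigma^2, given a Scale-inv-chi2(n0, s0_sq) prior
    and normal data with known mean mu."""
    n = len(data)
    sum_sq = sum((x - mu) ** 2 for x in data)  # equals n * s^2
    nu_post = n0 + n
    tau2_post = (n0 * s0_sq + sum_sq) / nu_post
    return nu_post, tau2_post

data = [1.2, 0.7, 1.9, 1.1, 0.6]  # made-up observations
mu = 1.0                          # assumed known mean

# Informative prior worth n0 = 3 pseudo-observations with s0^2 = 0.5.
nu_post, tau2_post = update_scaled_inv_chi2(3, 0.5, data, mu)

# n0 = 0 corresponds to the rescaling-invariant prior.
nu_flat, tau2_flat = update_scaled_inv_chi2(0, 0.0, data, mu)

# Posterior mean of sigma^2, defined when nu_post > 2.
posterior_mean = nu_post * tau2_post / (nu_post - 2)
print(nu_post, tau2_post, posterior_mean)
```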

Estimation of variance when mean is unknown

If the mean is not known, the most uninformative prior that can be taken for it is arguably the translation-invariant prior p(μ|I) ∝ const. Combined with the prior p(σ²|I) ∝ 1/σ² used above, this gives the following joint posterior distribution for μ and σ²,

[math]\displaystyle{ \begin{align} p(\mu, \sigma^2 \mid D, I) & \propto \frac{1}{\sigma^{n+2}} \exp \left[ -\frac{\sum_i^n(x_i-\mu)^2}{2\sigma^2} \right] \\ & = \frac{1}{\sigma^{n+2}} \exp \left[ -\frac{\sum_i^n(x_i-\bar{x})^2}{2\sigma^2} \right] \exp \left[ -\frac{n(\mu -\bar{x})^2}{2\sigma^2} \right] \end{align} }[/math]

The marginal posterior distribution for σ² is obtained from the joint posterior distribution by integrating out over μ,

[math]\displaystyle{ \begin{align} p(\sigma^2|D, I) \; \propto \; & \frac{1}{\sigma^{n+2}} \; \exp \left[ -\frac{\sum_i^n(x_i-\bar{x})^2}{2\sigma^2} \right] \; \int_{-\infty}^{\infty} \exp \left[ -\frac{n(\mu -\bar{x})^2}{2\sigma^2} \right] d\mu\\ = \; & \frac{1}{\sigma^{n+2}} \; \exp \left[ -\frac{\sum_i^n(x_i-\bar{x})^2}{2\sigma^2} \right] \; \sqrt{2 \pi \sigma^2 / n} \\ \propto \; & (\sigma^2)^{-(n+1)/2} \; \exp \left[ -\frac{(n-1)s^2}{2\sigma^2} \right] \end{align} }[/math]

This is again a scaled inverse chi-squared distribution, with parameters [math]\displaystyle{ \nu = n-1 }[/math] and [math]\displaystyle{ \tau^2 = s^2 = \sum_i (x_i - \bar{x})^2/(n-1) }[/math].
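One practical use of this result is to simulate from the marginal posterior of σ². The following sketch makes several assumptions: the data are made up, the function name `posterior_variance_draws` is hypothetical, and the interval is a plain Monte Carlo approximation. It uses the fact that a Scale-inv-χ²(ν, τ²) draw can be written as ντ² divided by a chi-squared draw with ν degrees of freedom, where the chi-squared draw is obtained from `random.gammavariate` with shape ν/2 and scale 2.

```python
import random
import statistics

def posterior_variance_draws(data, n_draws, seed=0):
    """Draw sigma^2 from Scale-inv-chi2(n - 1, s^2), the marginal posterior
    under the flat prior on mu and the 1/sigma^2 prior on sigma^2."""
    nu = len(data) - 1
    s2 = statistics.variance(data)  # sample variance, divides by n - 1
    rng = random.Random(seed)
    # chi-squared(nu) is Gamma(shape = nu / 2, scale = 2)
    return [nu * s2 / rng.gammavariate(nu / 2.0, 2.0) for _ in range(n_draws)]

data = [4.9, 5.6, 5.1, 4.4, 5.8, 5.0, 4.7, 5.3, 5.2, 4.6, 5.5, 4.8]  # made-up
draws = sorted(posterior_variance_draws(data, 20_000))

# Approximate central 95% credible interval for sigma^2.
lower = draws[int(0.025 * len(draws))]
upper = draws[int(0.975 * len(draws))]
s2 = statistics.variance(data)
mc_mean = sum(draws) / len(draws)
print(lower, s2, upper, mc_mean)
```

The Monte Carlo mean can be compared with the closed-form posterior mean (n − 1)s²/(n − 3), which follows from the mean formula with ν = n − 1.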

Related distributions

If [math]\displaystyle{ X \sim \mbox{Scale-inv-}\chi^2(\nu, \tau^2) }[/math] then [math]\displaystyle{ k X \sim \mbox{Scale-inv-}\chi^2(\nu, k \tau^2)\, }[/math]
If [math]\displaystyle{ X \sim \mbox{inv-}\chi^2(\nu) \, }[/math] (Inverse-chi-squared distribution) then [math]\displaystyle{ X \sim \mbox{Scale-inv-}\chi^2(\nu, 1/\nu) \, }[/math]
If [math]\displaystyle{ X \sim \mbox{Scale-inv-}\chi^2(\nu, \tau^2) }[/math] then [math]\displaystyle{ \frac{X}{\tau^2 \nu} \sim \mbox{inv-}\chi^2(\nu) \, }[/math] (Inverse-chi-squared distribution)
If [math]\displaystyle{ X \sim \mbox{Scale-inv-}\chi^2(\nu, \tau^2) }[/math] then [math]\displaystyle{ X \sim \textrm{Inv-Gamma}\left(\frac{\nu}{2}, \frac{\nu\tau^2}{2}\right) }[/math] (Inverse-gamma distribution)
The scaled inverse chi-squared distribution is a special case of the type 5 Pearson distribution.

References

Gelman, A. et al. (1995), Bayesian Data Analysis, pp. 474–475; also pp. 47, 480.

[1] Gelman et al. (1995), Bayesian Data Analysis (1st ed.), p. 68.


Original source: https://en.wikipedia.org/wiki/Scaled inverse chi-squared distribution.
