g-prior

In statistics, the g-prior is an objective prior for the regression coefficients of a multiple regression. It was introduced by Arnold Zellner.[1] It is a key tool in Bayes and empirical Bayes variable selection.[2][3]

Definition

Consider a data set [math]\displaystyle{ (x_1,y_1),\ldots,(x_n,y_n) }[/math], where the [math]\displaystyle{ x_i }[/math] are Euclidean vectors and the [math]\displaystyle{ y_i }[/math] are scalars. The multiple regression model is formulated as

[math]\displaystyle{ y_i = x_i^\top\beta + \varepsilon_i, }[/math]

where the [math]\displaystyle{ \varepsilon_i }[/math] are random errors. Zellner's g-prior for [math]\displaystyle{ \beta }[/math] is a multivariate normal distribution with covariance matrix proportional to the inverse Fisher information matrix for [math]\displaystyle{ \beta }[/math], similar to a Jeffreys prior.

Assume the [math]\displaystyle{ \varepsilon_i }[/math] are i.i.d. normal with zero mean and variance [math]\displaystyle{ \psi^{-1} }[/math]. Let [math]\displaystyle{ X }[/math] be the matrix with [math]\displaystyle{ i }[/math]th row equal to [math]\displaystyle{ x_i^\top }[/math]. Then the g-prior for [math]\displaystyle{ \beta }[/math] is the multivariate normal distribution with prior mean a hyperparameter [math]\displaystyle{ \beta_0 }[/math] and covariance matrix proportional to [math]\displaystyle{ \psi^{-1}(X^\top X)^{-1} }[/math], i.e.,

[math]\displaystyle{ \beta \mid \psi \sim \text{N}[\beta_0,\, g\psi^{-1} (X^\top X)^{-1}], }[/math]

where [math]\displaystyle{ g }[/math] is a positive scalar hyperparameter.
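As a minimal illustrative sketch, assuming a known error precision [math]\displaystyle{ \psi }[/math] and a simulated design matrix (all numerical values below are illustrative choices, not part of the definition), the g-prior covariance can be formed and sampled as follows:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup: n observations, p covariates.
n, p = 50, 3
X = rng.normal(size=(n, p))      # design matrix; row i is x_i^T
beta0 = np.zeros(p)              # prior mean hyperparameter beta_0
psi = 1.0                        # error precision (assumed known here)
g = 10.0                         # positive scalar g

# g-prior covariance: g * psi^{-1} * (X^T X)^{-1}
prior_cov = (g / psi) * np.linalg.inv(X.T @ X)

# Draw a few samples of beta from N(beta0, prior_cov)
beta_samples = rng.multivariate_normal(beta0, prior_cov, size=5)
print(beta_samples)
```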

Posterior distribution of beta

The posterior distribution of [math]\displaystyle{ \beta }[/math] is given by

[math]\displaystyle{ \beta \mid \psi,x,y \sim \text{N}\Big[q\hat\beta+(1-q)\beta_0,\,\frac q\psi(X^\top X)^{-1}\Big], }[/math]

where [math]\displaystyle{ q=g/(1+g) }[/math] and

[math]\displaystyle{ \hat\beta = (X^\top X)^{-1}X^\top y }[/math]

is the maximum likelihood (least squares) estimator of [math]\displaystyle{ \beta }[/math]. The vector of regression coefficients [math]\displaystyle{ \beta }[/math] can be estimated by its posterior mean under the g-prior, i.e., as the weighted average of the maximum likelihood estimator and [math]\displaystyle{ \beta_0 }[/math],

[math]\displaystyle{ \tilde\beta = q\hat\beta+(1-q)\beta_0. }[/math]

As [math]\displaystyle{ g\to\infty }[/math], the weight [math]\displaystyle{ q }[/math] tends to one and the posterior mean converges to the maximum likelihood estimator, while as [math]\displaystyle{ g\to 0 }[/math] it shrinks toward the prior mean [math]\displaystyle{ \beta_0 }[/math].
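The shrinkage formula above is straightforward to compute. A minimal sketch, again assuming a known error precision [math]\displaystyle{ \psi }[/math] and simulated data (all values illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative simulated data.
n, p = 50, 3
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, -2.0, 0.5])
psi, g = 1.0, 10.0
beta0 = np.zeros(p)
y = X @ beta_true + rng.normal(scale=psi ** -0.5, size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y          # maximum likelihood (least squares)
q = g / (1.0 + g)                     # shrinkage weight q = g/(1+g)

beta_tilde = q * beta_hat + (1 - q) * beta0   # posterior mean
post_cov = (q / psi) * XtX_inv                # posterior covariance
print(beta_tilde)
```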

Selection of g

Estimation of [math]\displaystyle{ g }[/math] is less straightforward than estimation of [math]\displaystyle{ \beta }[/math]. A variety of methods have been proposed, including fixed choices such as [math]\displaystyle{ g = n }[/math] (the unit information prior), fully Bayes approaches that place a prior on [math]\displaystyle{ g }[/math], and empirical Bayes estimators that choose [math]\displaystyle{ g }[/math] from the data.[3]
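As one illustrative possibility, an empirical Bayes value of [math]\displaystyle{ g }[/math] can be found by maximizing the marginal likelihood of the data: integrating [math]\displaystyle{ \beta }[/math] out of the model above (with [math]\displaystyle{ \psi }[/math] assumed known) gives [math]\displaystyle{ y \mid g \sim \text{N}[X\beta_0,\, \psi^{-1}(I_n + g X(X^\top X)^{-1}X^\top)] }[/math]. The sketch below maximizes this numerically; the simulated data and the use of SciPy's optimizer are illustrative assumptions, not a method prescribed by the references.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import multivariate_normal

rng = np.random.default_rng(2)

# Illustrative simulated data.
n, p = 50, 3
X = rng.normal(size=(n, p))
beta0 = np.zeros(p)
psi = 1.0
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=psi ** -0.5, size=n)

H = X @ np.linalg.inv(X.T @ X) @ X.T   # projection (hat) matrix

def neg_log_marginal(g):
    # Marginal of y with beta integrated out: N(X beta0, (I + g H) / psi)
    cov = (np.eye(n) + g * H) / psi
    return -multivariate_normal.logpdf(y, mean=X @ beta0, cov=cov)

res = minimize_scalar(neg_log_marginal, bounds=(1e-6, 1e6), method="bounded")
g_hat = res.x                          # empirical Bayes estimate of g
print(g_hat)
```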

References

  1. Zellner, A. (1986). "On Assessing Prior Distributions and Bayesian Regression Analysis with g Prior Distributions". In Goel, P.; Zellner, A. (eds.). Bayesian Inference and Decision Techniques: Essays in Honor of Bruno de Finetti. Studies in Bayesian Econometrics and Statistics. 6. New York: Elsevier. pp. 233–243. ISBN 978-0-444-87712-3.
  2. George, E.; Foster, D. P. (2000). "Calibration and empirical Bayes variable selection". Biometrika 87 (4): 731–747. doi:10.1093/biomet/87.4.731.
  3. Liang, F.; Paulo, R.; Molina, G.; Clyde, M. A.; Berger, J. O. (2008). "Mixtures of g priors for Bayesian variable selection". Journal of the American Statistical Association 103 (481): 410–423. doi:10.1198/016214507000001337.