Information matrix test


In econometrics, the information matrix test is used to determine whether a regression model is misspecified. The test was developed by Halbert White,[1] who observed that in a correctly specified model and under standard regularity assumptions, the Fisher information matrix can be expressed in either of two ways: as the outer product of the gradient, or as a function of the Hessian matrix of the log-likelihood function. Consider a linear model [math]\displaystyle{ \mathbf{y} = \mathbf{X} \mathbf{\beta} + \mathbf{u} }[/math], where the errors [math]\displaystyle{ \mathbf{u} }[/math] are assumed to be distributed [math]\displaystyle{ \mathrm{N}(0, \sigma^2 \mathbf{I}) }[/math]. If the parameters [math]\displaystyle{ \beta }[/math] and [math]\displaystyle{ \sigma^2 }[/math] are stacked in the vector [math]\displaystyle{ \mathbf{\theta}^{\mathsf{T}} = \begin{bmatrix} \beta & \sigma^2 \end{bmatrix} }[/math], the resulting log-likelihood function is

[math]\displaystyle{ \ell (\mathbf{\theta}) = - \frac{n}{2} \log \sigma^2 - \frac{1}{2 \sigma^2} \left( \mathbf{y} - \mathbf{X} \mathbf{\beta} \right)^{\mathsf{T}} \left( \mathbf{y} - \mathbf{X} \mathbf{\beta} \right) }[/math]
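As a minimal numerical sketch, the log-likelihood above (with the additive constant omitted, as in the formula) can be evaluated on simulated data; the sample size, design matrix, and coefficients below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a correctly specified linear model y = X beta + u
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])
sigma2_true = 0.5
y = X @ beta_true + rng.normal(scale=np.sqrt(sigma2_true), size=n)

def log_likelihood(theta, y, X):
    """Gaussian log-likelihood (additive constant dropped) with
    theta stacked as (beta, sigma^2), matching the expression above."""
    beta, sigma2 = theta[:-1], theta[-1]
    resid = y - X @ beta
    return -0.5 * len(y) * np.log(sigma2) - resid @ resid / (2.0 * sigma2)
```

Evaluating the function at the true parameters gives a (much) higher value than at badly misspecified ones, as expected of a log-likelihood.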

The information matrix can then be expressed as

[math]\displaystyle{ \mathbf{I} (\mathbf{\theta}) = \operatorname{E} \left[ \left( \frac{\partial \ell (\mathbf{\theta}) }{ \partial \mathbf{\theta} } \right) \left( \frac{\partial \ell (\mathbf{\theta}) }{ \partial \mathbf{\theta} } \right)^{\mathsf{T}} \right] }[/math]

that is, the expected value of the outer product of the gradient, or score. Second, it can be written as the negative of the expected Hessian of the log-likelihood function:

[math]\displaystyle{ \mathbf{I} (\mathbf{\theta}) = - \operatorname{E} \left[ \frac{\partial^2 \ell (\mathbf{\theta}) }{ \partial \mathbf{\theta} \, \partial \mathbf{\theta}^{\mathsf{T}}} \right] }[/math]

If the model is correctly specified, both expressions should be equal, so their sum, evaluated observation by observation, should be close to zero. Writing [math]\displaystyle{ \ell_i }[/math] for the log-likelihood contribution of observation [math]\displaystyle{ i }[/math], this yields

[math]\displaystyle{ \mathbf{\Delta}(\mathbf{\theta}) = \sum_{i=1}^n \left[ \frac{\partial^2 \ell_i(\mathbf{\theta}) }{ \partial \mathbf{\theta} \, \partial \mathbf{\theta}^{\mathsf{T}} } + \frac{\partial \ell_i(\mathbf{\theta}) }{ \partial \mathbf{\theta} } \left( \frac{\partial \ell_i (\mathbf{\theta}) }{ \partial \mathbf{\theta} } \right)^{\mathsf{T}} \right] }[/math]

where [math]\displaystyle{ \mathbf{\Delta} (\mathbf{\theta}) }[/math] is an [math]\displaystyle{ (r \times r) }[/math] random matrix and [math]\displaystyle{ r }[/math] is the number of parameters. White showed that the elements of [math]\displaystyle{ n^{-1/2} \mathbf{\Delta} ( \mathbf{\hat{\theta}} ) }[/math], where [math]\displaystyle{ \mathbf{\hat{\theta}} }[/math] is the maximum likelihood estimator, are asymptotically normally distributed with zero means when the model is correctly specified.[2] In small samples, however, the test generally performs poorly.[3]
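White's construction can be sketched numerically for the Gaussian linear model above: evaluate the per-observation Hessians and score outer products at the maximum likelihood estimates and sum them. This is an illustrative sketch (simulated data, analytically derived score and Hessian for the Gaussian case), not a full implementation of the test statistic:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data from a correctly specified Gaussian linear model
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)

# MLE: beta_hat by least squares, sigma^2_hat = RSS / n
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ beta_hat
s2 = e @ e / n
k = X.shape[1]
r = k + 1  # number of parameters in theta = (beta, sigma^2)

Delta = np.zeros((r, r))
score_sum = np.zeros(r)
for xi, ei in zip(X, e):
    # Per-observation score of the Gaussian log-likelihood
    score = np.concatenate([xi * ei / s2, [-0.5 / s2 + ei**2 / (2 * s2**2)]])
    # Per-observation Hessian
    H = np.zeros((r, r))
    H[:k, :k] = -np.outer(xi, xi) / s2
    H[:k, k] = H[k, :k] = -xi * ei / s2**2
    H[k, k] = 0.5 / s2**2 - ei**2 / s2**3
    Delta += H + np.outer(score, score)
    score_sum += score

# At the MLE the summed score is (numerically) zero; under correct
# specification the elements of n**-0.5 * Delta are approximately
# mean-zero normal in repeated samples
Z = Delta / np.sqrt(n)
```

Note that the Hessian and outer-product contributions largely cancel observation by observation, which is exactly the equality of the two information matrix expressions that the test exploits.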

References

  1. White, Halbert (1982). "Maximum Likelihood Estimation of Misspecified Models". Econometrica 50 (1): 1–25. doi:10.2307/1912526. 
  2. Godfrey, L. G. (1988). Misspecification Tests in Econometrics. Cambridge University Press. pp. 35–37. ISBN 0-521-26616-5. https://books.google.com/books?id=apXgcgoy7OgC&pg=PA35. 
  3. Orme, Chris (1990). "The Small-Sample Performance of the Information-Matrix Test". Journal of Econometrics 46 (3): 309–331. doi:10.1016/0304-4076(90)90012-I. 
