Testing in binary response index models
Denote a binary response index model as [math]\displaystyle{ P[Y_i = 1 \mid X_i ] = G (X_i' \beta) }[/math], [math]\displaystyle{ P[Y_i = 0 \mid X_i ] = 1 - G (X_i' \beta) }[/math], where [math]\displaystyle{ X_i \in R^N }[/math].

Description

This type of model is applied in many economic contexts, especially in modelling choice behavior. For instance, [math]\displaystyle{ Y_i }[/math] here may denote whether consumer [math]\displaystyle{ i }[/math] chooses to purchase a certain kind of chocolate, while [math]\displaystyle{ X_i }[/math] includes variables characterizing the features of consumer [math]\displaystyle{ i }[/math]. Through the function [math]\displaystyle{ G(\cdot) }[/math], the probability of choosing to purchase is determined.[1]

Now, suppose the maximum likelihood estimator (MLE) [math]\displaystyle{ \hat {\beta}_{u} }[/math] has an asymptotic distribution [math]\displaystyle{ \sqrt {n} ( \hat {\beta}_{u} - \beta) \xrightarrow{d} N (0, V) }[/math], and that there is a feasible consistent estimator [math]\displaystyle{ \hat {V} }[/math] of the asymptotic variance [math]\displaystyle{ V }[/math]. Usually, two types of hypotheses need to be tested in a binary response index model.
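As a concrete illustration of the setup above, the following sketch fits a binary response model by maximum likelihood and recovers an estimate of the asymptotic variance. It assumes a logit link for [math]\displaystyle{ G(\cdot) }[/math] and uses only numpy; the simulated data and the function name `fit_logit` are illustrative, not from the source.

```python
import numpy as np

def fit_logit(X, y, tol=1e-10, max_iter=100):
    """Newton-Raphson MLE for a binary response index model with a
    logistic link G.  Returns (beta_hat, V_hat), where V_hat estimates
    the asymptotic variance V of sqrt(n)*(beta_hat - beta)."""
    n, k = X.shape
    beta = np.zeros(k)
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))   # G(X_i' beta)
        score = X.T @ (y - p)                 # gradient of the log-likelihood
        W = p * (1.0 - p)                     # g(X_i' beta) for the logit link
        H = (X * W[:, None]).T @ X            # information matrix
        step = np.linalg.solve(H, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    V_hat = n * np.linalg.inv(H)              # so V_hat / n approximates Avar(beta_hat)
    return beta, V_hat

# simulate data with a known beta to check the estimator (illustrative values)
rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([0.5, 1.0, -1.0])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ beta_true))).astype(float)

beta_hat, V_hat = fit_logit(X, y)
```

With [math]\displaystyle{ n = 5000 }[/math] observations, the estimate lands close to the true coefficient vector, and [math]\displaystyle{ \hat{V}/n }[/math] gives the usual standard errors.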

The first type is testing multiple exclusion restrictions, namely, testing [math]\displaystyle{ H_0 : \beta_2 = 0 }[/math] with [math]\displaystyle{ \beta = [\beta_1 ; \beta_2] }[/math], where [math]\displaystyle{ \beta_2 \in R^Q }[/math]. If the unrestricted MLE can be computed easily, it is convenient to use the Wald test,[2] whose test statistic is constructed as:

[math]\displaystyle{ (D \hat {\beta}_{u})^{T} (D\hat {V}D^{T}/n)^{-1} (D\hat {\beta}_{u}) \xrightarrow{d} \chi_{Q}^{2} }[/math]

where [math]\displaystyle{ D }[/math] is the [math]\displaystyle{ Q \times N }[/math] selection matrix [math]\displaystyle{ [0 \; I_Q] }[/math] that picks out the last [math]\displaystyle{ Q }[/math] components of [math]\displaystyle{ \beta }[/math], so that [math]\displaystyle{ D \hat {\beta}_{u} }[/math] is the unrestricted estimate of [math]\displaystyle{ \beta_2 }[/math].

If the restricted MLE can be computed easily, it is more convenient to use the score test (LM test). Denote the maximum likelihood estimator under the restricted model as [math]\displaystyle{ \hat {\beta}_{r} }[/math] and define [math]\displaystyle{ \hat {u}_{i} \equiv Y_i - G(X_i' \hat {\beta}_{r}) }[/math], [math]\displaystyle{ \hat {G}_{i} \equiv G(X_i' \hat {\beta}_{r}) }[/math] and [math]\displaystyle{ \hat {g}_{i} \equiv g(X_i' \hat {\beta}_{r}) }[/math], where [math]\displaystyle{ g(\cdot) = G' (\cdot) }[/math]. Then run the OLS regression of [math]\displaystyle{ \frac {\hat {u}_{i}} {\sqrt{\hat {G}_{i}(1-\hat {G}_{i})}} }[/math] on [math]\displaystyle{ \frac {\hat {g}_{i}} {\sqrt{\hat {G}_{i}(1-\hat {G}_{i})}}X_{1i}', \frac {\hat {g}_{i}} {\sqrt{\hat {G}_{i}(1-\hat {G}_{i})}}X_{2i}' }[/math], where [math]\displaystyle{ X_i = [X_{1i} ; X_{2i} ] }[/math] and [math]\displaystyle{ X_{2i} \in R^Q }[/math]. The LM statistic is equal to the explained sum of squares from this regression[3] and is asymptotically distributed as [math]\displaystyle{ \chi_Q^2 }[/math].

If the MLE can be computed easily under both the restricted and the unrestricted models, the likelihood-ratio test is also a choice: let [math]\displaystyle{ L_u }[/math] denote the value of the log-likelihood function under the unrestricted model and let [math]\displaystyle{ L_r }[/math] denote the value under the restricted model; then [math]\displaystyle{ 2(L_u - L_r) }[/math] has an asymptotic [math]\displaystyle{ \chi_Q^2 }[/math] distribution.
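The three tests for exclusion restrictions can be sketched side by side on simulated data. This is a minimal illustration assuming a logit link: the Wald statistic uses only the unrestricted fit, the LR statistic compares the two log-likelihoods, and the LM statistic uses only the restricted fit via the weighted auxiliary OLS regression described above. All data and names are illustrative.

```python
import numpy as np
from scipy import stats

def fit_logit(X, y, tol=1e-10, max_iter=100):
    """Newton-Raphson logit MLE; returns (beta_hat, loglik, Avar),
    where Avar = inv(information) estimates V/n."""
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        H = (X * (p * (1.0 - p))[:, None]).T @ X
        step = np.linalg.solve(H, X.T @ (y - p))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    ll = np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    return beta, ll, np.linalg.inv(H)

rng = np.random.default_rng(1)
n, Q = 4000, 2
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])   # kept regressors
X2 = rng.normal(size=(n, Q))                             # regressors under test
X = np.hstack([X1, X2])
beta_true = np.array([0.3, 0.8, 0.7, -0.7])              # beta_2 != 0, so H0 is false
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ beta_true))).astype(float)

# Wald test: quadratic form in the last Q unrestricted coefficients
b_u, ll_u, Avar_u = fit_logit(X, y)
b2 = b_u[-Q:]
W = b2 @ np.linalg.solve(Avar_u[-Q:, -Q:], b2)

# LR test: twice the log-likelihood difference
b_r, ll_r, _ = fit_logit(X1, y)
LR = 2.0 * (ll_u - ll_r)

# LM test: weighted auxiliary OLS using only the restricted fit
G = 1.0 / (1.0 + np.exp(-X1 @ b_r))      # G_i hat
g = G * (1.0 - G)                        # g_i hat (logit: g = G(1-G))
w = np.sqrt(G * (1.0 - G))
dep = (y - G) / w                        # standardized residual u_i hat / w
Z = (g / w)[:, None] * X                 # standardized gradient regressors
fitted = Z @ np.linalg.lstsq(Z, dep, rcond=None)[0]
LM = fitted @ fitted                     # explained sum of squares

crit = stats.chi2.ppf(0.95, Q)           # all three statistics ~ chi2_Q under H0
```

Since the simulated [math]\displaystyle{ \beta_2 }[/math] is nonzero, all three statistics should far exceed the 5% critical value; under a true null, they would all be draws from (approximately) the same [math]\displaystyle{ \chi_Q^2 }[/math] distribution.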

The second type is testing a nonlinear hypothesis about [math]\displaystyle{ \beta }[/math], which can be represented as [math]\displaystyle{ H_0 : c(\beta) = 0 }[/math], where [math]\displaystyle{ c(\beta) }[/math] is a Q×1 vector of possibly nonlinear functions satisfying the differentiability and rank requirements. In most cases, it is not easy, or even feasible, to compute the MLE under the restricted model when [math]\displaystyle{ c(\beta) }[/math] includes some complicated nonlinear functions. Hence, the Wald test is usually used to deal with this problem. The test statistic is constructed as:

[math]\displaystyle{ c(\hat {\beta}_{u})' [\nabla_\beta c(\hat {\beta}_{u}) (\hat {V}/n) \nabla_\beta c(\hat {\beta}_{u})']^{-1} c(\hat {\beta}_{u}) \xrightarrow{d} \chi_Q^2 }[/math]

where [math]\displaystyle{ \nabla_\beta c(\hat {\beta}_{u}) }[/math] is the Q×N Jacobian of [math]\displaystyle{ c(\beta) }[/math] evaluated at [math]\displaystyle{ \hat {\beta}_{u} }[/math].
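A small sketch of this nonlinear Wald statistic, again assuming a logit link and an entirely illustrative restriction [math]\displaystyle{ c(\beta) = \beta_1 \beta_2 - \beta_0 }[/math] (with Q = 1), whose Jacobian is written out analytically:

```python
import numpy as np

def fit_logit(X, y, tol=1e-10, max_iter=100):
    """Newton-Raphson logit MLE; returns (beta_hat, Avar), Avar = V_hat/n."""
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        H = (X * (p * (1.0 - p))[:, None]).T @ X
        step = np.linalg.solve(H, X.T @ (y - p))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta, np.linalg.inv(H)

rng = np.random.default_rng(2)
n = 5000
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([0.5, 1.0, 0.5])     # satisfies b1*b2 - b0 = 0, so H0 is true
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ beta_true))).astype(float)

b, Avar = fit_logit(X, y)

# H0: c(beta) = beta[1]*beta[2] - beta[0] = 0  (hypothetical restriction, Q = 1)
c = np.array([b[1] * b[2] - b[0]])
J = np.array([[-1.0, b[2], b[1]]])        # Q x N Jacobian of c, evaluated at b
W = float(c @ np.linalg.solve(J @ Avar @ J.T, c))
```

Since the simulated parameters satisfy the restriction, the statistic should be an unremarkable draw from a [math]\displaystyle{ \chi_1^2 }[/math] distribution rather than a large value.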

For tests with very general and complicated alternatives, the test statistics might not have exactly the same representation as above, but we can still derive their formulas and asymptotic distributions by the delta method[4] and implement the Wald test, score test, or likelihood-ratio test.[5] Which test should be used is determined by the relative computational difficulty of the MLE under the restricted and unrestricted models.

References

  1. For an application example, refer to: Rayton, B. A. (2006): "Examining the Interconnection of Job Satisfaction and Organizational Commitment: An Application of the Bivariate Probit Model", The International Journal of Human Resource Management, Vol. 17, Iss. 1.
  2. Greene, W. H. (2003): Econometric Analysis, Prentice Hall, Upper Saddle River, NJ.
  3. Wooldridge, J. (2002): Econometric Analysis of Cross Section and Panel Data, MIT Press, Cambridge, Mass.
  4. Casella, G., and Berger, R. L. (2002): Statistical Inference, Duxbury Press.
  5. Engle, R. F. (1983): "Wald, Likelihood Ratio, and Lagrange Multiplier Tests in Econometrics", in Intriligator, M. D., and Griliches, Z. (eds.), Handbook of Econometrics II, Elsevier, pp. 796–801. ISBN 978-0-444-86185-6.