Ordered probit

In statistics, ordered probit is a generalization of the widely used probit analysis to the case of more than two outcomes of an ordinal dependent variable (a dependent variable for which the potential values have a natural ordering, as in poor, fair, good, excellent). The widely used logit method likewise has an ordered counterpart, ordered logit. Both ordered probit and ordered logit are particular methods of ordinal regression.

For example, in clinical research, the effect of a drug on a patient may be modeled with ordered probit regression. Independent variables may include the use or non-use of the drug as well as control variables such as age and details from medical history, such as whether the patient suffers from high blood pressure or heart disease. The dependent variable would take one of the following ranked outcomes: complete cure, relief of symptoms, no effect, deterioration of condition, death.

Another example application is Likert-type items, commonly employed in survey research, where respondents rate their agreement on an ordered scale (e.g., "Strongly disagree" to "Strongly agree"). The ordered probit model provides an appropriate fit to these data, preserving the ordering of response options while making no assumptions about the interval distances between options.[1]

Conceptual underpinnings

Suppose the underlying relationship to be characterized is[2]

[math]\displaystyle{ y^* = \mathbf{x}^{\mathsf{T}} \beta + \epsilon }[/math],

where [math]\displaystyle{ y^* }[/math] is the exact but unobserved dependent variable (perhaps the exact level of improvement by the patient); [math]\displaystyle{ \mathbf{x} }[/math] is the vector of independent variables; [math]\displaystyle{ \beta }[/math] is the vector of regression coefficients which we wish to estimate; and [math]\displaystyle{ \epsilon }[/math] is an error term, taken in the probit case to follow a standard normal distribution. Further suppose that we cannot observe [math]\displaystyle{ y^* }[/math] itself and can only observe the category of response:

[math]\displaystyle{ y= \begin{cases} 0~~ \text{if}~~y^* \le 0, \\ 1~~ \text{if}~~0\lt y^* \le \mu_1, \\ 2~~ \text{if}~~\mu_1 \lt y^* \le \mu_2, \\ \vdots \\ N~~ \text{if}~~ \mu_{N-1} \lt y^*. \end{cases} }[/math]

Then the ordered probit technique will use the observations on [math]\displaystyle{ y }[/math], which are a form of censored data on [math]\displaystyle{ y^* }[/math], to fit the parameter vector [math]\displaystyle{ \beta }[/math] together with the unknown thresholds [math]\displaystyle{ \mu_1, \ldots, \mu_{N-1} }[/math].
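
The mapping from the latent variable to the observed categories, and the category probabilities it implies, can be illustrated with a short Python sketch. It assumes that the error term is standard normal (the probit assumption), so that each category probability is a difference of normal CDF values at adjacent thresholds. The function names category_probabilities and observe are illustrative and do not come from the cited sources.

```python
import numpy as np
from scipy.stats import norm


def category_probabilities(x, beta, mu):
    """Return P(y = j | x) for j = 0..N under an ordered probit model.

    Assumes thresholds 0 < mu_1 < ... < mu_{N-1}, as in the cases above,
    and a standard normal error term.
    """
    xb = np.atleast_2d(x) @ beta                      # linear index x^T beta
    cuts = np.concatenate(([-np.inf, 0.0], mu, [np.inf]))
    # P(y = j) = Phi(cut_{j+1} - x^T beta) - Phi(cut_j - x^T beta)
    cdf = norm.cdf(cuts[None, :] - xb[:, None])
    return np.diff(cdf, axis=1)                       # shape (n, N + 1)


def observe(y_star, mu):
    """Map latent values y* to observed categories 0..N."""
    cuts = np.concatenate(([0.0], mu))
    # side="left" assigns y* equal to a threshold to the lower category
    return np.searchsorted(cuts, y_star, side="left")


# Hypothetical values chosen only for illustration.
probs = category_probabilities(np.array([0.5, -1.0]), np.array([1.0, 0.5]), [1.0, 2.0])
cats = observe(np.array([-1.0, 0.0, 0.5, 1.0, 1.5, 3.0]), [1.0, 2.0])
```

Each row of probs sums to one, since the intervals defined by the thresholds partition the real line.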

Estimation

The model cannot be consistently estimated using ordinary least squares; it is usually estimated using maximum likelihood. For details on how the equation is estimated, see the article Ordinal regression.
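
As a rough illustration of maximum likelihood estimation, the following Python sketch fits an ordered probit model to simulated data by numerically minimizing the negative log-likelihood. It is a minimal sketch under stated assumptions, not a reference implementation: the error term is taken to be standard normal, the first threshold is fixed at zero (so the regressors include a constant), and the remaining thresholds are parameterized through the logarithms of their gaps to keep them ordered. The function names, the simulated parameter values, and the choice of the BFGS optimizer are all illustrative.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def neg_log_likelihood(params, X, y):
    """Negative log-likelihood of an ordered probit model.

    params holds beta (k values) followed by log-gaps between thresholds;
    exponentiating and cumulating the gaps keeps 0 < mu_1 < mu_2 < ...
    """
    k = X.shape[1]
    beta, log_gaps = params[:k], params[k:]
    mu = np.cumsum(np.exp(log_gaps))
    cuts = np.concatenate(([-np.inf, 0.0], mu, [np.inf]))
    xb = X @ beta
    # P(y_i = j) = Phi(cut_{j+1} - x_i^T beta) - Phi(cut_j - x_i^T beta)
    p = norm.cdf(cuts[y + 1] - xb) - norm.cdf(cuts[y] - xb)
    return -np.sum(np.log(np.clip(p, 1e-300, None)))   # clip guards log(0)


def fit_ordered_probit(X, y):
    """Return (beta_hat, mu_hat) from numerical likelihood maximization."""
    k = X.shape[1]
    n_free_cuts = int(y.max()) - 1                     # mu_1 .. mu_{N-1}
    start = np.zeros(k + n_free_cuts)
    res = minimize(neg_log_likelihood, start, args=(X, y), method="BFGS")
    return res.x[:k], np.cumsum(np.exp(res.x[k:]))


# Simulated data with hypothetical "true" parameters.
rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # constant + one regressor
beta_true = np.array([0.5, 1.0])
mu_true = np.array([1.0, 2.0])
y_star = X @ beta_true + rng.normal(size=n)            # latent variable
y = np.searchsorted(np.concatenate(([0.0], mu_true)), y_star, side="left")

beta_hat, mu_hat = fit_ordered_probit(X, y)
```

With a sample of this size, the estimates should lie close to the values used in the simulation. In practice an established statistical package would normally be preferred over a hand-written optimizer.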

References

  1. Liddell, T; Kruschke, J (2018). "Analyzing ordinal data with metric models: What could possibly go wrong?". Journal of Experimental Social Psychology 79: 328–348. doi:10.1016/j.jesp.2018.08.009. https://scholarworks.iu.edu/dspace/bitstream/2022/21970/1/2018-03-23_wim_kruschke_ordinal-metric_flyer.pdf.
  2. Greene, William H. (2012). Econometric Analysis (Seventh ed.). Boston: Pearson Education. pp. 827–831. ISBN 978-0-273-75356-8. 

Further reading

  • Becker, William E.; Kennedy, Peter E. (1992). "A Graphical Exposition of the Ordered Probit". Econometric Theory 8 (1): 127–131. doi:10.1017/S0266466600010781.