Separation (statistics)

From HandWiki

In statistics, separation is a phenomenon associated with models for dichotomous or categorical outcomes, including logistic and probit regression. Separation occurs if the predictor (or a linear combination of some subset of the predictors) is associated with only one outcome value when the predictor range is split at a certain value.

The phenomenon

For example, suppose the predictor X is continuous and the outcome y = 1 for all observed x > 2. If the outcome values are (seemingly) perfectly determined by the predictor (e.g., y = 0 whenever x ≤ 2), then "complete separation" is said to occur. If instead there is some overlap at the boundary (e.g., y = 0 when x < 2, but y takes both values 0 and 1 when x = 2), then "quasi-complete separation" occurs. A 2 × 2 table with an empty (zero) cell is an example of quasi-complete separation.
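The two conditions can be checked mechanically for a single predictor. The following is a minimal sketch in Python, not a general diagnostic: the function name classify_separation is hypothetical, the outcome is assumed to be coded 0/1, and linear combinations of several predictors are not handled.

```python
def classify_separation(x, y):
    """Classify one predictor x against a binary outcome y coded 0/1.

    Returns "complete", "quasi-complete", or "none".
    """
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    if not x0 or not x1:
        raise ValueError("both outcome values must be observed")

    # Try both orientations: y = 1 at high x, or y = 1 at low x.
    orientations = ((x0, x1), (x1, x0))

    # Complete separation: the two outcome groups do not touch.
    for low, high in orientations:
        if max(low) < min(high):
            return "complete"

    # Quasi-complete separation: the groups meet only at a single
    # boundary value, and at least one point lies strictly off it.
    for low, high in orientations:
        touches = max(low) == min(high)
        off_boundary = min(low) < max(low) or max(high) > min(high)
        if touches and off_boundary:
            return "quasi-complete"

    return "none"


# The example from the text: y = 1 for x > 2 and y = 0 for x <= 2.
print(classify_separation([0, 1, 2, 3, 4], [0, 0, 0, 1, 1]))
# Both outcomes observed at x = 2.
print(classify_separation([0, 1, 2, 2, 3, 4], [0, 0, 0, 1, 1, 1]))
```

With a binary predictor, the second case corresponds to a 2 × 2 table with one empty cell.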

The problem

This observed form of the data is important because it can cause problems when estimating regression coefficients. For example, maximum likelihood (ML) estimation relies on maximizing the likelihood function. In a logistic regression with completely separated data, the likelihood has no maximum in the interior of the parameter space: it keeps increasing as the coefficients grow, so its supremum is approached only at the boundary. This leads to "infinite" estimates and, along with that, to problems with providing sensible standard errors.[1][2] Statistical software will often output an arbitrarily large parameter estimate with a very large standard error.[3]
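The divergence can be observed numerically. The sketch below fits a two-parameter logistic model to completely separated toy data by plain gradient ascent. The data, the function names, and the hand-picked step size are illustrative assumptions, and real software typically uses Newton-type iterations instead.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def softplus(z):
    """log(1 + exp(z)), computed stably."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def log_likelihood(b0, b1, x, y):
    """Log-likelihood of the model logit P(y = 1) = b0 + b1 * x."""
    total = 0.0
    for xi, yi in zip(x, y):
        z = b0 + b1 * xi
        total -= softplus(-z) if yi == 1 else softplus(z)
    return total


def fit_ml(x, y, steps, lr=0.05):
    """Run a fixed number of gradient-ascent steps from (0, 0).

    lr is a small step size chosen by hand for the toy data below.
    """
    b0 = b1 = 0.0
    for _ in range(steps):
        resid = [yi - sigmoid(b0 + b1 * xi) for xi, yi in zip(x, y)]
        b0 += lr * sum(resid)
        b1 += lr * sum(r * xi for r, xi in zip(resid, x))
    return b0, b1


# Completely separated toy data: y = 1 exactly when x > 2.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0, 0, 0, 1, 1, 1]

for steps in (100, 1000, 10000):
    b0, b1 = fit_ml(x, y, steps)
    print(steps, round(b1, 3), round(log_likelihood(b0, b1, x, y), 5))
```

The slope printed for each run keeps growing with the number of iterations while the log-likelihood creeps toward zero from below. The optimizer never reaches a maximum; it only stops when the iteration budget runs out.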

Possible remedies

One approach to "fixing" problems with ML estimation is regularization (or "continuity corrections").[4][5] In particular, for a logistic regression problem, exact logistic regression or Firth logistic regression (a bias-reduction method based on a penalized likelihood) may be an option.[6]
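To illustrate how a penalized likelihood restores a finite estimate, the following sketch applies Firth's modified score to a deliberately simplified model with one slope and no intercept, logit P(y = 1) = beta * x. The simplification, the function names, and the use of bisection are assumptions made for brevity; dedicated software fits the general multi-parameter case.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def firth_score(beta, x, y):
    """Firth-modified score for the model logit(p_i) = beta * x_i."""
    p, w = [], []
    for xi in x:
        z = beta * xi
        e = math.exp(-abs(z))
        p.append(sigmoid(z))
        w.append(e / (1.0 + e) ** 2)  # p * (1 - p), computed stably
    info = sum(wi * xi * xi for wi, xi in zip(w, x))  # Fisher information
    score = 0.0
    for xi, yi, pi, wi in zip(x, y, p, w):
        h = wi * xi * xi / info  # hat-matrix diagonal (one parameter)
        score += (yi - pi + h * (0.5 - pi)) * xi
    return score


def firth_fit(x, y, lo=-20.0, hi=20.0, tol=1e-10):
    """Find a root of the modified score by bisection on [lo, hi]."""
    f_lo = firth_score(lo, x, y)
    f_hi = firth_score(hi, x, y)
    if f_lo * f_hi > 0:
        raise ValueError("modified score does not change sign on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = firth_score(mid, x, y)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


# Completely separated toy data: y = 1 exactly when x > 0.
x = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]
y = [0, 0, 0, 1, 1, 1]
print(firth_fit(x, y))
```

For these data the ordinary score is positive for every finite beta, so ML has no finite solution. The extra term h * (0.5 - p) pulls fitted probabilities toward 1/2 and makes the modified score change sign at a finite value.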

Alternatively, one may avoid the problems associated with likelihood maximization by switching to a Bayesian approach to inference. Within a Bayesian framework, the pathologies arising from likelihood maximization are avoided by the use of integration rather than maximization, as well as by the use of sensible prior probability distributions.[7]
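The role of integration and of the prior can be sketched with a one-parameter model, logit P(y = 1) = beta * x, whose posterior mean is approximated by summing over a grid. The normal prior with standard deviation 2.5, the grid limits, and the function names are assumptions made for illustration, not a recommended default.

```python
import math


def log_sigmoid(z):
    """log(1 / (1 + exp(-z))), computed without overflow."""
    return -(max(-z, 0.0) + math.log1p(math.exp(-abs(z))))


def log_posterior(beta, x, y, prior_sd):
    """Unnormalized log-posterior: Bernoulli likelihood plus normal prior."""
    log_lik = sum(
        log_sigmoid(beta * xi if yi == 1 else -beta * xi)
        for xi, yi in zip(x, y)
    )
    log_prior = -0.5 * (beta / prior_sd) ** 2
    return log_lik + log_prior


def posterior_mean(x, y, prior_sd=2.5, half_width=30.0, n=6001):
    """Approximate E[beta | data] on a uniform grid over [-half_width, half_width]."""
    step = 2.0 * half_width / (n - 1)
    grid = [-half_width + i * step for i in range(n)]
    logp = [log_posterior(b, x, y, prior_sd) for b in grid]
    peak = max(logp)  # subtract the maximum before exponentiating
    weights = [math.exp(lp - peak) for lp in logp]
    return sum(b * w for b, w in zip(grid, weights)) / sum(weights)


# Completely separated toy data: y = 1 exactly when x > 0.
x = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]
y = [0, 0, 0, 1, 1, 1]
print(posterior_mean(x, y, prior_sd=2.5))
print(posterior_mean(x, y, prior_sd=10.0))
```

Both posterior means are finite even though the ML estimate is not. The second is larger than the first: under separation the likelihood is nearly flat for large beta, so the prior determines how far the estimate is pulled back.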

References

  1. Zeng, Guoping; Zeng, Emily (2019). "On the Relationship between Multicollinearity and Separation in Logistic Regression". Communications in Statistics. Simulation and Computation 50 (7): 1989–1997. doi:10.1080/03610918.2019.1589511. 
  2. Albert, A.; Anderson, J. A. (1984). "On the Existence of Maximum Likelihood Estimates in Logistic Regression Models". Biometrika 71 (1): 1–10. doi:10.1093/biomet/71.1.1. 
  3. McCullough, B. D.; Vinod, H. D. (2003). "Verifying the Solution from a Nonlinear Solver: A Case Study". American Economic Review 93 (3): 873–892. doi:10.1257/000282803322157133. 
  4. Cole, S.R.; Chu, H.; Greenland, S. (2014), "Maximum likelihood, profile likelihood, and penalized likelihood: A primer", American Journal of Epidemiology 179 (2): 252–260, doi:10.1093/aje/kwt245, PMID 24173548 
  5. Sweeting, M.J.; Sutton, A.J.; Lambert, P.C. (2004), "What to add to nothing? Use and avoidance of continuity corrections in meta-analysis of sparse data", Statistics in Medicine 23 (9): 1351–1375, doi:10.1002/sim.1761, PMID 15116347 
  6. Mansournia, Mohammad Ali; Geroldinger, Angelika; Greenland, Sander; Heinze, Georg (2018). "Separation in Logistic Regression: Causes, Consequences, and Control". American Journal of Epidemiology 187 (4): 864–870. doi:10.1093/aje/kwx299. PMID 29020135. 
  7. Gelman, A.; Jakulin, A.; Pittau, M.G.; Su, Y. (2008), "A weakly informative default prior distribution for logistic and other regression models", Annals of Applied Statistics 2 (4): 1360–1383, doi:10.1214/08-AOAS191 

Further reading

  • Albert, A.; Anderson, J. A. (1984), "On the existence of maximum likelihood estimates in logistic regression models", Biometrika 71 (1): 1–10, doi:10.1093/biomet/71.1.1 
  • Kosmidis, I.; Firth, D. (2021), "Jeffreys-prior penalty, finiteness and shrinkage in binomial-response generalized linear models", Biometrika 108 (1): 71–82, doi:10.1093/biomet/asaa052 
  • Davidson, Russell; MacKinnon, James G. (2004). Econometric Theory and Methods. New York: Oxford University Press. pp. 458–459. ISBN 978-0-19-512372-2. 
