Yates's correction for continuity

In statistics, Yates's correction for continuity (or Yates's chi-squared test) is used in certain situations when testing for independence in a contingency table. It aims to correct the error introduced by assuming that the discrete probabilities of frequencies in the table can be approximated by a continuous distribution (chi-squared). In some cases, Yates's correction may adjust too far, and so its current use is limited.

Correction for approximation error

Using the chi-squared distribution to interpret Pearson's chi-squared statistic requires one to assume that the discrete probability of observed binomial frequencies in the table can be approximated by the continuous chi-squared distribution. This assumption is not quite correct, and introduces some error.

To reduce the error in approximation, Frank Yates, an English statistician, suggested a correction for continuity that adjusts the formula for Pearson's chi-squared test by subtracting 0.5 from the absolute difference between each observed value and its expected value in a 2 × 2 contingency table.^[1] This reduces the chi-squared value obtained and thus increases its p-value.

The effect of Yates's correction is to prevent overestimation of statistical significance for small samples. The formula is chiefly used when at least one cell of the table has an expected count smaller than 5. Unfortunately, Yates's correction may tend to overcorrect. This can produce an overly conservative result that fails to reject the null hypothesis when it should be rejected (a type II error). For this reason, it has been suggested that Yates's correction is unnecessary even with quite small sample sizes,^[2] such as:

[math]\displaystyle{ \sum_{i=1}^N O_i = 20 \, }[/math]
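The rule of thumb above is stated in terms of expected counts, which for a test of independence are derived from the row and column totals. The following is a minimal sketch of that check, assuming the table is given as a nested list of counts; the function names and the example data are illustrative and not part of the source.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def has_small_expected_count(table, threshold=5):
    """True if any expected count is below the threshold (5 in the rule of thumb)."""
    return any(e < threshold for row in expected_counts(table) for e in row)


# Illustrative data only: 20 observations in total.
table = [[7, 2], [3, 8]]
print(expected_counts(table))           # [[4.5, 4.5], [5.5, 5.5]]
print(has_small_expected_count(table))  # True
```

Note that the check looks at expected counts, not observed ones: a table can contain an observed cell below 5 while every expected count is 5 or more.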

The following is Yates's corrected version of Pearson's chi-squared statistic:

[math]\displaystyle{ \chi_\text{Yates}^2 = \sum_{i=1}^{N} {(|O_i - E_i| - 0.5)^2 \over E_i} }[/math]

where:

O_i = an observed frequency

E_i = an expected (theoretical) frequency, asserted by the null hypothesis

N = number of distinct events
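As an illustration of the formula, the sketch below computes both the uncorrected Pearson statistic and the Yates-corrected one for the same data, and converts each to a p-value. It assumes a 2 × 2 table, so the reference distribution has one degree of freedom, for which the upper-tail probability can be written with the complementary error function. The function names and the data are hypothetical.

```python
import math


def pearson_chi_squared(observed, expected):
    """Uncorrected Pearson statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def yates_chi_squared(observed, expected):
    """Yates-corrected statistic: sum of (|O - E| - 0.5)^2 / E."""
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))


def p_value_1df(chi2):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Illustrative 2 x 2 table [[8, 2], [2, 8]], flattened cell by cell.
# All margins equal 10, so every expected count is 10 * 10 / 20 = 5.
observed = [8, 2, 2, 8]
expected = [5.0, 5.0, 5.0, 5.0]

chi2_plain = pearson_chi_squared(observed, expected)
chi2_yates = yates_chi_squared(observed, expected)
print(chi2_plain, p_value_1df(chi2_plain))
print(chi2_yates, p_value_1df(chi2_yates))
```

For this table the corrected statistic (5.0) is smaller than the uncorrected one (about 7.2), so its p-value is larger, which is the conservative shift described above.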

2 × 2 table

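As a shortcut, the corrected statistic for a 2 × 2 table can be computed directly from the cell counts. With the following entries, where N = a + b + c + d is the total number of observations (not the number of cells, as in the general formula above):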

	S	F
A	a	b	a+b
B	c	d	c+d
	a+c	b+d	N

[math]\displaystyle{ \chi_\text{Yates}^2 = \frac{N(|ad - bc| - N/2)^2}{(a+b) (c+d) (a+c) (b+d)}. }[/math]

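When |ad − bc| is smaller than N/2, the subtraction gives a negative quantity whose square would inflate the statistic. In such cases it is better to clamp the corrected difference at zero:

[math]\displaystyle{ \chi_\text{Yates}^2 = \frac{N( \max(0, |ad - bc| - N/2) )^2}{(a+b) (c+d) (a+c) (b+d)}. }[/math]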
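The 2 × 2 shortcut, including the clamp at zero, can be sketched as follows. This is a minimal illustration under the assumption that all four marginal totals are positive; the function name and the example counts are hypothetical.

```python
def yates_2x2(a, b, c, d):
    """Yates-corrected chi-squared for the 2 x 2 table [[a, b], [c, d]].

    The corrected difference is clamped at zero so that tables with
    |ad - bc| < N/2 yield a statistic of 0 instead of a spurious positive value.
    Assumes every row and column total is positive.
    """
    n = a + b + c + d
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * corrected ** 2 / denominator


print(yates_2x2(8, 2, 2, 8))  # 5.0
print(yates_2x2(5, 5, 5, 5))  # 0.0
```

For the second table the rows are identical, so ad − bc = 0; without the clamp the formula would give 20 × (−10)² / 10000 = 0.2, even though the observed counts match the expected counts exactly.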

References

[1] Yates, F. (1934). "Contingency table involving small numbers and the χ² test". Supplement to the Journal of the Royal Statistical Society 1(2): 217–235. JSTOR 2983604.
[2] Sokal, R. R.; Rohlf, F. J. (1981). Biometry: The Principles and Practice of Statistics in Biological Research. Oxford: W.H. Freeman. ISBN 0-7167-1254-7.


Original source: https://en.wikipedia.org/wiki/Yates's correction for continuity.
