Proportionate reduction of error


Proportionate reduction of error (PRE) is the gain in precision when predicting the dependent variable [math]\displaystyle{ y }[/math] from knowing the independent variable [math]\displaystyle{ x }[/math] (or a collection of several variables). It is a goodness-of-fit measure for statistical models and forms the mathematical basis for several correlation coefficients.[1] This summary statistic is particularly useful and popular for evaluating models in which the dependent variable is binary, taking on values {0,1}.

Example

If both [math]\displaystyle{ x }[/math] and [math]\displaystyle{ y }[/math] are measured on a cardinal (interval or ratio) scale, then without knowing [math]\displaystyle{ x }[/math], the best predictor of an unknown [math]\displaystyle{ y }[/math] in the least-squares sense is [math]\displaystyle{ \bar{y} }[/math], the arithmetic mean of the [math]\displaystyle{ y }[/math]-data. The total prediction error is then [math]\displaystyle{ E_1 = \sum_{i=1}^n{(y_i - \bar{y})^2} }[/math].

If, however, [math]\displaystyle{ x }[/math] and a function relating [math]\displaystyle{ y }[/math] to [math]\displaystyle{ x }[/math] are known, for example a straight line [math]\displaystyle{ \hat{y}_i = a + b x_i }[/math] fitted by least squares, then the prediction error becomes [math]\displaystyle{ E_2 = \sum_{i=1}^n{(y_i - \hat{y}_i)^2} }[/math]. The coefficient of determination is then [math]\displaystyle{ r^2 = \frac{E_1 - E_2}{E_1} = 1 - \frac{E_2}{E_1} }[/math], the fraction of the variance of [math]\displaystyle{ y }[/math] that is explained by [math]\displaystyle{ x }[/math]. Its square root is the absolute value of Pearson's product-moment correlation [math]\displaystyle{ r }[/math].
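The calculation above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name pre_linear and the sample data are assumptions made for the example, and the straight line is taken to be the ordinary least-squares fit.

```python
from statistics import mean

def pre_linear(x, y):
    """Return (E1, E2, PRE, r) for a least-squares line y_hat = a + b*x.

    Hypothetical helper for illustration; assumes x and y are equal-length
    sequences of numbers and that neither is constant.
    """
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    # Least-squares slope and intercept.
    b = sxy / sxx
    a = y_bar - b * x_bar

    # E1: error when every y_i is predicted by the mean of y.
    e1 = syy
    # E2: error when y_i is predicted by the fitted line.
    e2 = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

    pre = (e1 - e2) / e1
    r = sxy / (sxx * syy) ** 0.5  # Pearson's product-moment correlation
    return e1, e2, pre, r

# Made-up sample data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
e1, e2, pre, r = pre_linear(x, y)
print(e1, e2, pre, r * r)
```

For this sample, E_1 = 6 and E_2 = 2.4, so the PRE is 0.6, which agrees with the square of r up to floating-point rounding.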

Several other correlation coefficients have a PRE interpretation and are used for variables measured on different scales:

Predicted variable scale | Predictor variable scale | Coefficient | Symmetric
nominal, binary | nominal, binary | Guttman's λ[2] | yes
ordinal | nominal | Freeman's θ[3] | yes
cardinal | nominal | [math]\displaystyle{ \eta^2 }[/math][4] | no
ordinal | binary, ordinal | Wilson's e[5] | yes
cardinal | binary | point biserial correlation | yes
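
The same ratio [math]\displaystyle{ (E_1 - E_2)/E_1 }[/math] can be formed for nominal or binary variables by counting misclassifications instead of summing squared deviations. The sketch below is an assumption-laden illustration of a directional, lambda-type measure: without [math]\displaystyle{ x }[/math], the modal category of [math]\displaystyle{ y }[/math] is predicted; with [math]\displaystyle{ x }[/math], the modal category of [math]\displaystyle{ y }[/math] within each category of [math]\displaystyle{ x }[/math] is predicted. The function name pre_nominal and the data are hypothetical, and the result need not coincide with the symmetric coefficient listed in the table.

```python
from collections import Counter

def pre_nominal(x, y):
    """Return (E1, E2, PRE) using modal-category prediction of y.

    Hypothetical helper for illustration; assumes y takes at least two
    distinct values, otherwise E1 is zero and the ratio is undefined.
    """
    n = len(y)
    # E1: errors made when always predicting the overall modal category of y.
    e1 = n - max(Counter(y).values())

    # Count the categories of y separately within each category of x.
    by_x = {}
    for xi, yi in zip(x, y):
        by_x.setdefault(xi, Counter())[yi] += 1

    # E2: errors made when predicting the modal category of y within each x group.
    e2 = sum(sum(c.values()) - max(c.values()) for c in by_x.values())
    return e1, e2, (e1 - e2) / e1

# Made-up binary outcome y and two-category predictor x, for illustration only.
x_cat = ["a", "a", "a", "a", "b", "b", "b", "b", "b", "b"]
y_bin = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
n_e1, n_e2, n_pre = pre_nominal(x_cat, y_bin)
print(n_e1, n_e2, n_pre)
```

Here always guessing the modal value 0 gives 4 errors, while guessing the modal value within each group of x gives 2 errors, so knowing x halves the number of prediction errors.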

References

  1. Freeman, L.C.: Elementary applied statistics. New York, London, Sydney (John Wiley and Sons) 1965.
  2. Guttman, L. The quantification of a class of attributes: A theory and method of scale construction. In: The prediction of personal adjustment. Horst, P.; Wallin, P.; Guttman, L. et al. (eds.) New York (Social Science Research Council) 1941, pp. 319–348.
  3. Freeman, L.C.: Elementary applied statistics. New York, London, Sydney (John Wiley and Sons) 1965.
  4. anonymous. Fehlerreduktionsmaße [web-site, accessed 2017-07-29]. 2016. Available from: https://de.wikipedia.org/wiki/Fehlerreduktionsma%C3%9Fe#.CE.B72.
  5. Freeman, L.C.: Order-based statistics and monotonicity: A family of ordinal measures of association. J. Math. Sociol. 1986, vol. 12, no. 1, pp. 49–69. Available from: http://moreno.ss.uci.edu/41.pdf.