Bennett, Alpert and Goldstein's S

Bennett, Alpert & Goldstein's S is a statistical measure of inter-rater agreement, introduced by Bennett et al. in 1954.[1]

Rationale for use

Bennett et al. suggested that adjusting inter-rater reliability for the proportion of agreement that might be expected by chance yields a better measure than the raw agreement between raters.[2] They proposed an index that adjusts the observed proportion of agreement according to the number of categories employed.

Mathematical formulation

The formula for S is

[math]\displaystyle{ S = \frac{ Q P_a - 1 } { Q - 1 } }[/math]

where Q is the number of categories and Pa is the proportion of agreement between raters.
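As a quick illustration of the formula (the function name and example values here are hypothetical, not from the original paper), S can be computed directly from Q and Pa:

```python
def bennett_s(p_a, q):
    """Bennett, Alpert & Goldstein's S, given the observed proportion
    of agreement p_a and the number of categories q."""
    if q < 2:
        raise ValueError("S requires at least two categories")
    return (q * p_a - 1) / (q - 1)

# Example: two raters agree on 70% of items rated into 3 categories.
s = bennett_s(0.7, 3)  # (3 * 0.7 - 1) / (3 - 1) ≈ 0.55
```

Note that when observed agreement equals the chance level 1/Q, S is 0, and perfect agreement (Pa = 1) gives S = 1.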

The variance of S is

[math]\displaystyle{ \operatorname{Var}(S) = \left( \frac { Q } { Q - 1 } \right)^2 \frac { P_a ( 1 - P_a ) } { n - 1 } }[/math]

where n is the number of items rated.
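A minimal sketch of the variance computation (the function name and example values are illustrative; the factor Pa(1 − Pa) keeps the variance nonnegative):

```python
def bennett_s_variance(p_a, q, n):
    """Variance of Bennett, Alpert & Goldstein's S for n rated items,
    q categories, and observed proportion of agreement p_a."""
    if n < 2:
        raise ValueError("variance requires at least two rated items")
    return (q / (q - 1)) ** 2 * p_a * (1 - p_a) / (n - 1)

# Example: 50 items, 3 categories, 70% observed agreement.
var_s = bennett_s_variance(0.7, 3, 50)
se_s = var_s ** 0.5  # standard error of S
```

The square root of this variance gives a standard error that could be used, for instance, for an approximate confidence interval around S.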

Notes

This statistic is also known as Guilford’s G.[3] Guilford was the first person to use the approach extensively in the determination of inter-rater reliability.[citation needed]

References

  1. Bennett, EM; Alpert, R; Goldstein, AC (1954). "Communications through limited response questioning". Public Opinion Quarterly 18 (3): 303–308. doi:10.1086/266520. 
  2. Warrens, Matthijs J. (May 2012). "The effect of combining categories on Bennett, Alpert and Goldstein's S". Statistical Methodology 9 (3): 341–352. doi:10.1016/j.stamet.2011.09.001. 
  3. Holley, JW; Guilford, JP (1964). "A note on the G index of agreement". Educational and Psychological Measurement 24 (4): 749–753. doi:10.1177/001316446402400402.