Standardized mean of a contrast variable

In statistics, the standardized mean of a contrast variable (SMCV or SMC) is a parameter that assesses effect size. The SMCV is defined as the mean of a contrast variable divided by its standard deviation.[1][2] The SMCV was first proposed for one-way ANOVA cases[2] and was then extended to multi-factor ANOVA cases.[3]

Background

Consistent interpretations for the strength of group comparison, as represented by a contrast, are important.[4][5]

When there are only two groups involved in a comparison, SMCV is the same as the strictly standardized mean difference (SSMD). SSMD belongs to a popular type of effect-size measure called "standardized mean differences"[6] which includes Cohen's [math]\displaystyle{ d }[/math][7] and Glass's [math]\displaystyle{ \delta. }[/math][8]

In ANOVA, a similar parameter for measuring the strength of group comparison is the standardized effect size (SES).[9] One issue with SES is that its values are not comparable across contrasts with different coefficients. SMCV does not have this issue.

Concept

Suppose the random values in [math]\displaystyle{ t }[/math] groups, represented by random variables [math]\displaystyle{ G_1, G_2, \ldots, G_t }[/math], have means [math]\displaystyle{ \mu_1, \mu_2, \ldots, \mu_t }[/math] and variances [math]\displaystyle{ \sigma_1^2, \sigma_2^2, \ldots, \sigma_t^2 }[/math], respectively. A contrast variable [math]\displaystyle{ V }[/math] is defined by

[math]\displaystyle{ V=\sum_{i=1}^t c_i G_i , }[/math]

where the [math]\displaystyle{ c_i }[/math]'s are a set of coefficients representing a comparison of interest and satisfy [math]\displaystyle{ \sum_{i=1}^t c_i = 0 }[/math]. The SMCV of contrast variable [math]\displaystyle{ V }[/math], denoted by [math]\displaystyle{ \lambda }[/math], is defined as[1]

[math]\displaystyle{ \lambda = \frac{\operatorname{E}(V)}{\operatorname{stdev}(V)} = \frac{\sum_{i=1}^t c_i \mu_i}{\sqrt{\text{Var}\left(\sum_{i=1}^t c_i G_i\right)}} = \frac{\sum_{i=1}^t c_i \mu_i}{\sqrt{\sum_{i=1}^t c_i^2 \sigma_i^2 + 2\sum_{i=1}^{t-1} \sum_{j=i+1}^{t} c_i c_j \sigma_{ij} }} }[/math]

where [math]\displaystyle{ \sigma_{ij} }[/math] is the covariance of [math]\displaystyle{ G_{i} }[/math] and [math]\displaystyle{ G_{j} }[/math]. When [math]\displaystyle{ G_1, G_2, \ldots, G_t }[/math] are independent,

[math]\displaystyle{ \lambda = \frac{\sum_{i=1}^t c_i \mu_i}{\sqrt{\sum_{i=1}^t c_i^2 \sigma_i^2 }}. }[/math]
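
As an illustration of the definition above, the following minimal Python sketch computes the population SMCV from contrast coefficients, group means and a full covariance matrix. The function name `smcv` and all numbers are hypothetical and chosen only for illustration; they do not come from the cited sources.

```python
import math

def smcv(coeffs, means, cov):
    """Population SMCV: E(V) / stdev(V) for V = sum_i c_i * G_i.

    cov is the full t x t covariance matrix, with cov[i][i] = sigma_i^2.
    """
    if abs(sum(coeffs)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    t = len(coeffs)
    mean_v = sum(c * m for c, m in zip(coeffs, means))
    # Var(V) = sum_i sum_j c_i c_j sigma_ij; this equals the diagonal terms
    # plus twice the sum over all pairs with i < j.
    var_v = sum(coeffs[i] * coeffs[j] * cov[i][j]
                for i in range(t) for j in range(t))
    return mean_v / math.sqrt(var_v)

# Hypothetical example: three independent groups, comparing group 3
# against the average of groups 1 and 2.
coeffs = [-0.5, -0.5, 1.0]
means = [10.0, 12.0, 14.0]
cov = [[4.0, 0.0, 0.0],
       [0.0, 4.0, 0.0],
       [0.0, 0.0, 4.0]]
lam = smcv(coeffs, means, cov)  # equals 3 / sqrt(6)
```

With a diagonal covariance matrix, as here, the double sum reduces to the independent-groups formula.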

Classifying rule for the strength of group comparisons

The population value of SMCV (denoted by [math]\displaystyle{ \lambda }[/math]) can be used to classify the strength of a comparison represented by a contrast variable, as shown in the following table.[1][2] This classifying rule has a probabilistic basis due to the link between SMCV and [math]\displaystyle{ c^+ }[/math]-probability.[1]

| Effect type | Effect subtype | Thresholds for negative SMCV | Thresholds for positive SMCV |
|---|---|---|---|
| Extra large | Extremely strong | [math]\displaystyle{ \lambda \le -5 }[/math] | [math]\displaystyle{ \lambda \ge 5 }[/math] |
| Extra large | Very strong | [math]\displaystyle{ -5 \lt \lambda \le -3 }[/math] | [math]\displaystyle{ 5 \gt \lambda \ge 3 }[/math] |
| Extra large | Strong | [math]\displaystyle{ -3 \lt \lambda \le -2 }[/math] | [math]\displaystyle{ 3 \gt \lambda \ge 2 }[/math] |
| Extra large | Fairly strong | [math]\displaystyle{ -2 \lt \lambda \le -1.645 }[/math] | [math]\displaystyle{ 2 \gt \lambda \ge 1.645 }[/math] |
| Large | Moderate | [math]\displaystyle{ -1.645 \lt \lambda \le -1.28 }[/math] | [math]\displaystyle{ 1.645 \gt \lambda \ge 1.28 }[/math] |
| Large | Fairly moderate | [math]\displaystyle{ -1.28 \lt \lambda \le -1 }[/math] | [math]\displaystyle{ 1.28 \gt \lambda \ge 1 }[/math] |
| Medium | Fairly weak | [math]\displaystyle{ -1 \lt \lambda \le -0.75 }[/math] | [math]\displaystyle{ 1 \gt \lambda \ge 0.75 }[/math] |
| Medium | Weak | [math]\displaystyle{ -0.75 \lt \lambda \lt -0.5 }[/math] | [math]\displaystyle{ 0.75 \gt \lambda \gt 0.5 }[/math] |
| Medium | Very weak | [math]\displaystyle{ -0.5 \le \lambda \lt -0.25 }[/math] | [math]\displaystyle{ 0.5 \ge \lambda \gt 0.25 }[/math] |
| Small | Extremely weak | [math]\displaystyle{ -0.25 \le \lambda \lt 0 }[/math] | [math]\displaystyle{ 0.25 \ge \lambda \gt 0 }[/math] |
| No effect | | [math]\displaystyle{ \lambda = 0 }[/math] | [math]\displaystyle{ \lambda = 0 }[/math] |
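
The classifying rule can be expressed as a small lookup. The sketch below is one possible encoding, not code from the cited sources; the name `classify_smcv` is hypothetical. It relies on the thresholds being symmetric about zero, so it works on the absolute value of [math]\displaystyle{ \lambda }[/math], and it keeps the table's convention that the boundaries 0.5 and 0.25 belong to the weaker category.

```python
# Each rule: (bound on |lambda|, whether the bound itself is included,
#             effect type, effect subtype), ordered from strongest to weakest.
_RULES = [
    (5.0,   True,  "Extra large", "Extremely strong"),
    (3.0,   True,  "Extra large", "Very strong"),
    (2.0,   True,  "Extra large", "Strong"),
    (1.645, True,  "Extra large", "Fairly strong"),
    (1.28,  True,  "Large",       "Moderate"),
    (1.0,   True,  "Large",       "Fairly moderate"),
    (0.75,  True,  "Medium",      "Fairly weak"),
    (0.5,   False, "Medium",      "Weak"),
    (0.25,  False, "Medium",      "Very weak"),
    (0.0,   False, "Small",       "Extremely weak"),
]

def classify_smcv(lam):
    """Return (effect type, effect subtype) for a population SMCV value."""
    if lam != lam:  # NaN is the only value not equal to itself
        raise ValueError("SMCV must be a number")
    a = abs(lam)
    for bound, inclusive, effect_type, subtype in _RULES:
        if a > bound or (inclusive and a == bound):
            return effect_type, subtype
    # Only lam == 0 reaches this point; the table lists no separate subtype,
    # so the type label is repeated here as an assumption.
    return "No effect", "No effect"

print(classify_smcv(-3.2))  # ('Extra large', 'Very strong')
print(classify_smcv(0.5))   # ('Medium', 'Very weak')
```

Because the rule is stated for the population value, applying it to an estimate classifies the estimate only, and the result should be read together with a confidence interval.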

Statistical estimation and inference

The estimation and inference methods for SMCV presented below are for one-factor experiments.[1][2] Estimation and inference of SMCV for multi-factor experiments have also been discussed.[1][3]

The estimation of SMCV depends on how samples are obtained in a study. When the groups are correlated, it is usually difficult to estimate the covariance among groups. In such a case, a good strategy is to obtain matched or paired samples (or subjects) and to conduct contrast analysis based on the matched samples. A simple example of matched contrast analysis is the analysis of the paired difference in drug effects before and after the same patients take a drug. The alternative strategy is not to match or pair the samples and to conduct contrast analysis based on the unmatched or unpaired samples. A simple example of unmatched contrast analysis is the comparison of efficacy between a new drug taken by some patients and a standard drug taken by other patients. Methods of estimation for SMCV and [math]\displaystyle{ c^+ }[/math]-probability in matched contrast analysis may differ from those used in unmatched contrast analysis.

Unmatched samples

Consider an independent sample of size [math]\displaystyle{ n_i }[/math],

[math]\displaystyle{ Y_i = \left(Y_{i1}, Y_{i2}, \ldots, Y_{i n_i}\right) }[/math]

from the [math]\displaystyle{ i^\text{th} (i=1, 2, \ldots, t) }[/math] group [math]\displaystyle{ G_i }[/math], where the [math]\displaystyle{ Y_i }[/math]'s are mutually independent. Let [math]\displaystyle{ \bar{Y}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} Y_{ij} }[/math],

[math]\displaystyle{ s_i^2 = \frac{1}{n_i-1} \sum_{j=1}^{n_i} \left(Y_{ij} - \bar{Y}_i\right)^2, }[/math]
[math]\displaystyle{ N = \sum_{i=1}^t n_i }[/math]

and

[math]\displaystyle{ \text{MSE } = \frac{1}{N-t} \sum_{i=1}^t \left(n_i - 1\right)s_i^2. }[/math]

When the [math]\displaystyle{ t }[/math] groups have unequal variances, the maximum likelihood estimate (MLE) and method-of-moments estimate (MM) of SMCV ([math]\displaystyle{ \lambda }[/math]) are, respectively,[1][2]

[math]\displaystyle{ \hat{\lambda}_\text{MLE } = \frac{\sum_{i=1}^t c_i \bar{Y}_i}{\sqrt{\sum_{i=1}^t \frac{n_i - 1}{n_i}c_i^2 s_i^2 }} }[/math]

and

[math]\displaystyle{ \hat{\lambda}_\text{MM} = \frac{\sum_{i=1}^t c_i \bar{Y}_i}{\sqrt{\sum_{i=1}^t c_i^2 s_i^2 }}. }[/math]
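
The two estimates differ only in whether each sample variance is rescaled by [math]\displaystyle{ (n_i - 1)/n_i }[/math]. A minimal Python sketch of both formulas follows; the function name `smcv_mle_mm` and the data are hypothetical and serve only to show the arithmetic.

```python
import math
from statistics import mean, variance

def smcv_mle_mm(coeffs, samples):
    """Return (MLE, MM) estimates of SMCV for unmatched samples.

    samples[i] is the list of observations from group i;
    statistics.variance uses the n_i - 1 denominator, matching s_i^2.
    """
    n = [len(y) for y in samples]
    ybar = [mean(y) for y in samples]
    s2 = [variance(y) for y in samples]
    numerator = sum(c * m for c, m in zip(coeffs, ybar))
    mle_var = sum((ni - 1) / ni * c * c * v
                  for ni, c, v in zip(n, coeffs, s2))
    mm_var = sum(c * c * v for c, v in zip(coeffs, s2))
    return numerator / math.sqrt(mle_var), numerator / math.sqrt(mm_var)

# Hypothetical data: three groups of three observations each.
samples = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [5.0, 6.0, 7.0]]
coeffs = [-0.5, -0.5, 1.0]
mle, mm = smcv_mle_mm(coeffs, samples)
# Here the contrast mean is 3, the MM variance is 2.25 and the MLE variance
# is 1.5, so mm = 2.0 and mle = 3 / sqrt(1.5).
```

Since [math]\displaystyle{ (n_i - 1)/n_i \lt 1 }[/math], the MLE is always at least as large in magnitude as the MM estimate, and the gap narrows as the group sizes grow.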

When the [math]\displaystyle{ t }[/math] groups have equal variance, under the normality assumption, the uniformly minimum-variance unbiased estimate (UMVUE) of SMCV ([math]\displaystyle{ \lambda }[/math]) is[1][2]

[math]\displaystyle{ \hat{\lambda}_\text{UMVUE} = \sqrt\frac{K}{N - t} \frac{\sum_{i=1}^t c_i \bar{Y}_i}{\sqrt{\sum_{i=1}^t \text{MSE } c_i^2}} }[/math]

where [math]\displaystyle{ K = \frac{2 \left(\Gamma\left(\frac{N - t}{2}\right)\right)^2}{\left(\Gamma\left(\frac{N - t - 1}{2}\right)\right)^2} }[/math].
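
A direct evaluation of [math]\displaystyle{ K }[/math] can overflow for large [math]\displaystyle{ N - t }[/math] because the gamma function grows very quickly. The sketch below, which is an illustration rather than code from the cited sources, computes the factor [math]\displaystyle{ \sqrt{K/(N-t)} }[/math] through the log-gamma function instead. The function name `smcv_umvue_unmatched` and the data are hypothetical.

```python
import math
from statistics import mean, variance

def smcv_umvue_unmatched(coeffs, samples):
    """UMVUE of SMCV for unmatched samples, assuming equal variances and normality."""
    t = len(samples)
    n = [len(y) for y in samples]
    df = sum(n) - t  # N - t
    if df < 2:
        raise ValueError("need N - t >= 2 for the correction factor")
    # Pooled variance estimate (MSE).
    mse = sum((ni - 1) * variance(y) for ni, y in zip(n, samples)) / df
    numerator = sum(c * mean(y) for c, y in zip(coeffs, samples))
    # sqrt(K / df) with K = 2 * Gamma(df/2)^2 / Gamma((df-1)/2)^2,
    # evaluated on the log scale for numerical stability.
    correction = math.sqrt(2.0 / df) * math.exp(
        math.lgamma(df / 2.0) - math.lgamma((df - 1) / 2.0))
    return correction * numerator / math.sqrt(mse * sum(c * c for c in coeffs))

# Hypothetical data: three groups of three observations each.
samples = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [5.0, 6.0, 7.0]]
coeffs = [-0.5, -0.5, 1.0]
estimate = smcv_umvue_unmatched(coeffs, samples)
```

The correction factor is below 1 and approaches 1 as [math]\displaystyle{ N - t }[/math] increases, so the UMVUE shrinks the plug-in ratio slightly toward zero in small samples.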

A confidence interval for SMCV can be constructed using the following non-central t-distribution:[1][2]

[math]\displaystyle{ T = \frac{\sum_{i=1}^t c_i \bar{Y}_i}{\sqrt{\sum_{i=1}^t \text{MSE } c_i^2/n_i}} \sim \text{noncentral } t(N-t, b\lambda) }[/math]

where [math]\displaystyle{ b = \sqrt{\frac{\sum_{i=1}^t c_i^2}{\sum_{i=1}^t c_i^2/n_i}}. }[/math]
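
One common way to turn this distribution into an interval is to invert it with respect to the noncentrality parameter. The following is a sketch under that assumption, not a procedure quoted from the cited sources. The symbols [math]\displaystyle{ t_\text{obs} }[/math] (the observed value of [math]\displaystyle{ T }[/math]), [math]\displaystyle{ F(\cdot\,; \nu, \delta) }[/math] (the cumulative distribution function of the non-central t-distribution with [math]\displaystyle{ \nu }[/math] degrees of freedom and noncentrality [math]\displaystyle{ \delta }[/math]), [math]\displaystyle{ \alpha }[/math], [math]\displaystyle{ \lambda_L }[/math] and [math]\displaystyle{ \lambda_U }[/math] are introduced here for illustration. Because [math]\displaystyle{ F(t_\text{obs}; \nu, \delta) }[/math] decreases as [math]\displaystyle{ \delta }[/math] increases, the limits of a [math]\displaystyle{ 1 - \alpha }[/math] interval [math]\displaystyle{ (\lambda_L, \lambda_U) }[/math] can be taken as the solutions of

[math]\displaystyle{ F\left(t_\text{obs};\, N - t,\, b\lambda_L\right) = 1 - \frac{\alpha}{2}, \qquad F\left(t_\text{obs};\, N - t,\, b\lambda_U\right) = \frac{\alpha}{2}. }[/math]

Each equation has a single unknown and is usually solved numerically, for example by bisection on [math]\displaystyle{ \lambda }[/math].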

Matched samples

In matched contrast analysis, assume that there are [math]\displaystyle{ n }[/math] independent samples [math]\displaystyle{ \left(Y_{1j}, Y_{2j}, \cdots, Y_{tj}\right) }[/math] from [math]\displaystyle{ t }[/math] groups ([math]\displaystyle{ G_i }[/math]'s), where [math]\displaystyle{ i = 1, 2, \cdots, t; j = 1, 2, \cdots, n }[/math]. Then the [math]\displaystyle{ j^\text{th} }[/math] observed value of a contrast [math]\displaystyle{ V = \sum_{i=1}^t c_i G_i }[/math] is [math]\displaystyle{ v_j = \sum_{i=1}^t c_i Y_{ij} }[/math].

Let [math]\displaystyle{ \bar{V} }[/math] and [math]\displaystyle{ s_V^2 }[/math] be the sample mean and sample variance of the observed contrast values [math]\displaystyle{ v_1, v_2, \cdots, v_n }[/math], respectively. Under normality assumptions, the UMVUE of SMCV is[1]

[math]\displaystyle{ \hat{\lambda}_\text{UMVUE} = \sqrt\frac{K}{n - 1}\frac{\bar{V}}{s_V} }[/math]

where [math]\displaystyle{ K = \frac{2\left(\Gamma\left(\frac{n - 1}{2}\right)\right)^2}{\left(\Gamma\left(\frac{n - 2}{2}\right)\right)^2}. }[/math]
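
In the matched case the contrast is formed within each subject first, so no covariance between groups needs to be estimated separately. A minimal Python sketch of the estimate above follows; the function name `smcv_umvue_matched` and the before/after data are hypothetical.

```python
import math
from statistics import mean, stdev

def smcv_umvue_matched(coeffs, rows):
    """UMVUE of SMCV for matched samples, assuming normality.

    rows[j] holds the t matched observations (Y_1j, ..., Y_tj) of subject j.
    """
    # Observed contrast value v_j for each subject.
    v = [sum(c * y for c, y in zip(coeffs, row)) for row in rows]
    df = len(v) - 1  # n - 1
    if df < 2:
        raise ValueError("need n >= 3 for the correction factor")
    # sqrt(K / df) with K = 2 * Gamma(df/2)^2 / Gamma((df-1)/2)^2,
    # evaluated on the log scale for numerical stability.
    correction = math.sqrt(2.0 / df) * math.exp(
        math.lgamma(df / 2.0) - math.lgamma((df - 1) / 2.0))
    return correction * mean(v) / stdev(v)  # stdev uses the n - 1 denominator

# Hypothetical paired data: (before, after) measurements on five subjects,
# with the contrast "after minus before".
rows = [(10.0, 12.0), (11.0, 14.0), (9.0, 10.0), (12.0, 15.0), (10.0, 13.0)]
coeffs = [-1.0, 1.0]
estimate = smcv_umvue_matched(coeffs, rows)
```

With two groups and coefficients [math]\displaystyle{ (-1, 1) }[/math], this is an estimate of the SSMD for paired differences.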

A confidence interval for SMCV can be constructed using the following non-central t-distribution:[1]

[math]\displaystyle{ T = \frac{\bar{V}}{s_V/\sqrt{n}} \sim \text{noncentral } t\left(n - 1, \sqrt{n}\lambda\right). }[/math]

References

  1. Zhang XHD (2011). Optimal High-Throughput Screening: Practical Experimental Design and Data Analysis for Genome-scale RNAi Research. Cambridge University Press. ISBN 978-0-521-73444-8.
  2. Zhang XHD (2009). "A method for effectively comparing gene effects in multiple conditions in RNAi and expression-profiling research". Pharmacogenomics 10: 345–58. doi:10.2217/14622416.10.3.345. PMID 20397965.
  3. Zhang XHD (2010). "Assessing the size of gene or RNAi effects in multifactor high-throughput experiments". Pharmacogenomics 11: 199–213. doi:10.2217/PGS.09.136. PMID 20136359.
  4. Contrasts and Effect Sizes in Behavioral Research. Cambridge University Press. 2000. ISBN 0-521-65980-9. 
  5. Huberty CJ (2002). "A history of effect size indices". Educational and Psychological Measurement 62: 227–40. doi:10.1177/0013164402062002002. 
  6. Kirk RE (1996). "Practical significance: A concept whose time has come". Educational and Psychological Measurement 56: 746–59. doi:10.1177/0013164496056005002. 
  7. Cohen J (1962). "The statistical power of abnormal-social psychological research: A review". Journal of Abnormal and Social Psychology 65: 145–53. doi:10.1037/h0045186. PMID 13880271. 
  8. Glass GV (1976). "Primary, secondary, and meta-analysis of research". Educational Researcher 5: 3–8. doi:10.3102/0013189X005010003. 
  9. Steiger JH (2004). "Beyond the F test: Effect size confidence intervals and tests of close fit in the analysis of variance and contrast analysis". Psychological Methods 9: 164–82. doi:10.1037/1082-989x.9.2.164. PMID 15137887.