Uniformly most powerful test

In statistical hypothesis testing, a uniformly most powerful (UMP) test is a hypothesis test that has the greatest power [math]\displaystyle{ 1 - \beta }[/math] among all possible tests of a given size α, for every parameter value in the alternative hypothesis. For example, according to the Neyman–Pearson lemma, the likelihood-ratio test is UMP for testing simple (point) hypotheses.

Setting

Let [math]\displaystyle{ X }[/math] denote a random vector (corresponding to the measurements), taken from a parametrized family of probability density functions or probability mass functions [math]\displaystyle{ f_{\theta}(x) }[/math], which depends on the unknown deterministic parameter [math]\displaystyle{ \theta \in \Theta }[/math]. The parameter space [math]\displaystyle{ \Theta }[/math] is partitioned into two disjoint sets [math]\displaystyle{ \Theta_0 }[/math] and [math]\displaystyle{ \Theta_1 }[/math]. Let [math]\displaystyle{ H_0 }[/math] denote the hypothesis that [math]\displaystyle{ \theta \in \Theta_0 }[/math], and let [math]\displaystyle{ H_1 }[/math] denote the hypothesis that [math]\displaystyle{ \theta \in \Theta_1 }[/math]. The binary test of hypotheses is performed using a test function [math]\displaystyle{ \varphi(x) }[/math] with a rejection region [math]\displaystyle{ R }[/math] (a subset of the measurement space):

[math]\displaystyle{ \varphi(x) = \begin{cases} 1 & \text{if } x \in R \\ 0 & \text{if } x \in R^c \end{cases} }[/math]

meaning that [math]\displaystyle{ H_1 }[/math] is accepted if the measurement [math]\displaystyle{ X \in R }[/math] and that [math]\displaystyle{ H_0 }[/math] is accepted if the measurement [math]\displaystyle{ X\in R^c }[/math]. Note that [math]\displaystyle{ R }[/math] and [math]\displaystyle{ R^c }[/math] are disjoint and together cover the measurement space.

Formal definition

A test function [math]\displaystyle{ \varphi(x) }[/math] is UMP of size [math]\displaystyle{ \alpha }[/math] if for any other test function [math]\displaystyle{ \varphi'(x) }[/math] satisfying

[math]\displaystyle{ \sup_{\theta\in\Theta_0}\; \operatorname{E}[\varphi'(X)|\theta]=\alpha'\leq\alpha=\sup_{\theta\in\Theta_0}\; \operatorname{E}[\varphi(X)|\theta]\, }[/math]

we have

[math]\displaystyle{ \forall \theta \in \Theta_1, \quad \operatorname{E}[\varphi'(X)|\theta]= 1 - \beta'(\theta) \leq 1 - \beta(\theta) =\operatorname{E}[\varphi(X)|\theta]. }[/math]
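The two quantities in this definition, the size and the power, can be illustrated numerically. The following is a minimal sketch, not part of the original article, assuming a hypothetical binomial model [math]\displaystyle{ X \sim \operatorname{Binomial}(n, p) }[/math] with [math]\displaystyle{ \Theta_0 = \{p \leq p_0\} }[/math] and a rejection region [math]\displaystyle{ R = \{x \geq k\} }[/math]. The names `power_function`, `n`, `k` and `p0` are chosen for illustration only, and the supremum over the null set is approximated on a grid.

```python
from math import comb

def power_function(p, n, k):
    """E[phi(X) | p] for X ~ Binomial(n, p) and phi(x) = 1 if x >= k, else 0."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k, n + 1))

# Hypothetical setup: 10 trials, reject H0 when at least 8 successes are seen.
n, k, p0 = 10, 8, 0.5

# Size: supremum of the rejection probability over Theta_0 = {p <= p0},
# approximated on a grid of null values that includes p0 itself.
null_grid = [p0 * i / 100 for i in range(101)]
size = max(power_function(p, n, k) for p in null_grid)

# Power: rejection probability at one point of the alternative, p = 0.8.
power = power_function(0.8, n, k)

print(f"size = {size:.4f}, power at p = 0.8: {power:.4f}")
```

In this sketch the rejection probability increases with [math]\displaystyle{ p }[/math], so the supremum over the null set is attained at the boundary value [math]\displaystyle{ p_0 }[/math].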

The Karlin–Rubin theorem

The Karlin–Rubin theorem can be regarded as an extension of the Neyman–Pearson lemma to composite hypotheses.[1] Consider a scalar measurement having a probability density function parameterized by a scalar parameter θ, and define the likelihood ratio [math]\displaystyle{ l(x) = f_{\theta_1}(x) / f_{\theta_0}(x) }[/math]. If [math]\displaystyle{ l(x) }[/math] is monotone non-decreasing in [math]\displaystyle{ x }[/math] for any pair [math]\displaystyle{ \theta_1 \geq \theta_0 }[/math] (meaning that the greater [math]\displaystyle{ x }[/math] is, the more likely [math]\displaystyle{ H_1 }[/math] is), then the threshold test:

[math]\displaystyle{ \varphi(x) = \begin{cases} 1 & \text{if } x \gt x_0 \\ 0 & \text{if } x \lt x_0 \end{cases} }[/math]
where [math]\displaystyle{ x_0 }[/math] is chosen such that [math]\displaystyle{ \operatorname{E}_{\theta_0}\varphi(X)=\alpha }[/math]

is the UMP test of size α for testing [math]\displaystyle{ H_0: \theta \leq \theta_0 \text{ vs. } H_1: \theta \gt \theta_0 . }[/math]

Note that exactly the same test is also UMP for testing [math]\displaystyle{ H_0: \theta = \theta_0 \text{ vs. } H_1: \theta \gt \theta_0 . }[/math]
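As an illustration of the threshold test, here is a minimal sketch that assumes a single scalar measurement [math]\displaystyle{ X \sim N(\theta, \sigma^2) }[/math] with known [math]\displaystyle{ \sigma }[/math], a family whose likelihood ratio is non-decreasing in [math]\displaystyle{ x }[/math]. The function names `threshold` and `power` are illustrative and do not come from the original text.

```python
from statistics import NormalDist

def threshold(theta0, alpha, sigma=1.0):
    """Return x0 such that P(X > x0) = alpha when X ~ N(theta0, sigma^2)."""
    return NormalDist(theta0, sigma).inv_cdf(1 - alpha)

def power(theta, x0, sigma=1.0):
    """Rejection probability E_theta[phi(X)] = P(X > x0) for X ~ N(theta, sigma^2)."""
    return 1 - NormalDist(theta, sigma).cdf(x0)

theta0, alpha = 0.0, 0.05
x0 = threshold(theta0, alpha)

# The rejection probability stays at or below alpha on the null set
# (theta <= theta0) and grows with theta on the alternative.
for theta in (-1.0, 0.0, 0.5, 1.0, 2.0):
    print(f"theta = {theta:+.1f}: rejection probability = {power(theta, x0):.4f}")
```

Because the rejection probability is non-decreasing in [math]\displaystyle{ \theta }[/math], calibrating the threshold at the boundary [math]\displaystyle{ \theta_0 }[/math] controls the size over the whole composite null hypothesis.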

Important case: exponential family

Although the Karlin–Rubin theorem may seem weak because of its restriction to a scalar parameter and a scalar measurement, there exists a host of problems to which the theorem applies. In particular, the one-dimensional exponential family of probability density functions or probability mass functions with

[math]\displaystyle{ f_\theta(x) = g(\theta) h(x) \exp(\eta(\theta) T(x)) }[/math]

has a monotone non-decreasing likelihood ratio in the sufficient statistic [math]\displaystyle{ T(x) }[/math], provided that [math]\displaystyle{ \eta(\theta) }[/math] is non-decreasing.
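As a brief check of this claim, assuming [math]\displaystyle{ h(x) \gt 0 }[/math] and [math]\displaystyle{ g(\theta_0) \gt 0 }[/math] so that the ratio is defined, the factor [math]\displaystyle{ h(x) }[/math] cancels for any pair [math]\displaystyle{ \theta_1 \geq \theta_0 }[/math]:

[math]\displaystyle{ l(x) = \frac{f_{\theta_1}(x)}{f_{\theta_0}(x)} = \frac{g(\theta_1)}{g(\theta_0)} \exp\left( \left[ \eta(\theta_1) - \eta(\theta_0) \right] T(x) \right) }[/math]

Since [math]\displaystyle{ \eta(\theta_1) - \eta(\theta_0) \geq 0 }[/math] when [math]\displaystyle{ \eta }[/math] is non-decreasing, [math]\displaystyle{ l(x) }[/math] is a non-decreasing function of [math]\displaystyle{ T(x) }[/math].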

Example

Let [math]\displaystyle{ X=(X_0 ,\ldots , X_{M-1}) }[/math] denote i.i.d. normally distributed [math]\displaystyle{ N }[/math]-dimensional random vectors with mean [math]\displaystyle{ \theta m }[/math] and covariance matrix [math]\displaystyle{ R }[/math], where the vector [math]\displaystyle{ m }[/math] and the matrix [math]\displaystyle{ R }[/math] are assumed known. We then have

[math]\displaystyle{ \begin{align} f_\theta (X) = {} & (2 \pi)^{-MN/2} |R|^{-M/2} \exp \left\{-\frac 1 2 \sum_{n=0}^{M-1} (X_n - \theta m)^T R^{-1}(X_n - \theta m) \right\} \\[4pt] = {} & (2 \pi)^{-MN/2} |R|^{-M/2} \exp \left\{-\frac 1 2 \sum_{n=0}^{M-1} \left (\theta^2 m^T R^{-1} m \right ) \right\} \\[4pt] & \exp \left\{-\frac 1 2 \sum_{n=0}^{M-1} X_n^T R^{-1} X_n \right\} \exp \left\{\theta m^T R^{-1} \sum_{n=0}^{M-1}X_n \right\} \end{align} }[/math]

which is exactly in the form of the exponential family shown in the previous section, with the sufficient statistic being

[math]\displaystyle{ T(X) = m^T R^{-1} \sum_{n=0}^{M-1}X_n. }[/math]

Thus, we conclude that the test

[math]\displaystyle{ \varphi(T) = \begin{cases} 1 & T \gt t_0 \\ 0 & T \lt t_0 \end{cases} \qquad \operatorname{E}_{\theta_0} \varphi (T) = \alpha }[/math]

is the UMP test of size [math]\displaystyle{ \alpha }[/math] for testing [math]\displaystyle{ H_0: \theta \leq \theta_0 }[/math] vs. [math]\displaystyle{ H_1: \theta \gt \theta_0 }[/math].
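The threshold [math]\displaystyle{ t_0 }[/math] can be computed in closed form, because under [math]\displaystyle{ \theta_0 }[/math] the statistic [math]\displaystyle{ T }[/math] is normal with mean [math]\displaystyle{ M \theta_0 q }[/math] and variance [math]\displaystyle{ M q }[/math], where [math]\displaystyle{ q = m^T R^{-1} m }[/math]. The sketch below rests on that observation. The function `ump_gaussian_test`, the argument `w` (assumed to hold [math]\displaystyle{ R^{-1} m }[/math], computed beforehand with any linear solver) and the sample values are hypothetical and used for illustration only.

```python
from math import sqrt
from statistics import NormalDist

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def ump_gaussian_test(samples, m, w, theta0, alpha):
    """One-sided threshold test on T = m^T R^{-1} sum_n X_n.

    `w` is assumed to already hold R^{-1} m, where R is the covariance matrix.
    Returns the tuple (T, t0, reject).
    """
    M = len(samples)
    q = dot(m, w)                        # q = m^T R^{-1} m
    T = sum(dot(w, x) for x in samples)  # sufficient statistic
    # Under theta0, T is normal with mean M*theta0*q and variance M*q.
    t0 = M * theta0 * q + sqrt(M * q) * NormalDist().inv_cdf(1 - alpha)
    return T, t0, T > t0

# Hypothetical data: N = 2, M = 3, covariance diag(2, 0.5), m = (1, 1).
m = (1.0, 1.0)
w = (1.0 / 2.0, 1.0 / 0.5)               # R^{-1} m for the diagonal covariance
samples = [(1.2, 0.8), (0.9, 1.1), (1.0, 1.3)]

T, t0, reject = ump_gaussian_test(samples, m, w, theta0=0.0, alpha=0.05)
print(f"T = {T:.3f}, t0 = {t0:.3f}, reject H0: {reject}")
```

Passing [math]\displaystyle{ R^{-1} m }[/math] as an input keeps the sketch free of any linear-algebra dependency; in practice it would be obtained by solving the linear system [math]\displaystyle{ R w = m }[/math].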

Further discussion

In general, UMP tests do not exist for vector parameters or for two-sided tests (tests in which the alternative hypothesis lies on both sides of the null hypothesis). The reason is that in these situations, the most powerful test of a given size for one possible value of the parameter (e.g. for [math]\displaystyle{ \theta_1 }[/math] where [math]\displaystyle{ \theta_1 \gt \theta_0 }[/math]) is different from the most powerful test of the same size for a different value of the parameter (e.g. for [math]\displaystyle{ \theta_2 }[/math] where [math]\displaystyle{ \theta_2 \lt \theta_0 }[/math]). As a result, no test is uniformly most powerful in these situations.
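The two-sided case can be illustrated numerically. The following sketch assumes a single measurement [math]\displaystyle{ X \sim N(\theta, 1) }[/math] with [math]\displaystyle{ \theta_0 = 0 }[/math] and compares three tests of the same size: a right-tailed test, a left-tailed test and an equal-tailed two-sided test. The function names are illustrative and do not come from the original text.

```python
from statistics import NormalDist

Z = NormalDist()            # standard normal distribution
alpha, theta0 = 0.05, 0.0

def power_right(theta):
    """Rejects when X > theta0 + z_{1-alpha}."""
    return 1 - Z.cdf(Z.inv_cdf(1 - alpha) - (theta - theta0))

def power_left(theta):
    """Rejects when X < theta0 + z_{alpha}."""
    return Z.cdf(Z.inv_cdf(alpha) - (theta - theta0))

def power_two_sided(theta):
    """Rejects when |X - theta0| > z_{1-alpha/2}."""
    c = Z.inv_cdf(1 - alpha / 2)
    d = theta - theta0
    return (1 - Z.cdf(c - d)) + Z.cdf(-c - d)

for theta in (-1.0, 1.0):
    print(f"theta = {theta:+.1f}: "
          f"right = {power_right(theta):.4f}, "
          f"left = {power_left(theta):.4f}, "
          f"two-sided = {power_two_sided(theta):.4f}")
```

Under these assumptions the right-tailed test has the highest power of the three at [math]\displaystyle{ \theta = 1 }[/math] and the lowest at [math]\displaystyle{ \theta = -1 }[/math], and the left-tailed test shows the reverse pattern, so none of the three dominates on both sides of [math]\displaystyle{ \theta_0 }[/math].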


References

  1. Casella, G.; Berger, R. L. (2008). Statistical Inference. Brooks/Cole. ISBN 0-495-39187-5. Theorem 8.3.17.

Further reading

  • Ferguson, T. S. (1967). Mathematical Statistics: A decision theoretic approach. New York: Academic Press. 
  • Mood, A. M.; Graybill, F. A.; Boes, D. C. (1974). Introduction to the theory of statistics (3rd ed.). New York: McGraw-Hill. 
  • Scharf, L. L. (1991). Statistical Signal Processing. Addison-Wesley. Section 4.7.