Uniformly most powerful test

Short description: Theoretically optimal hypothesis test

In statistical hypothesis testing, a uniformly most powerful (UMP) test is a hypothesis test which has the greatest power $1-\beta$ among all possible tests of a given size $\alpha$. For example, according to the Neyman–Pearson lemma, the likelihood-ratio test is UMP for testing simple (point) hypotheses.

Setting

Let $X$ denote a random vector (corresponding to the measurements), taken from a parametrized family of probability density functions or probability mass functions $f_\theta(x)$, which depends on the unknown deterministic parameter $\theta \in \Theta$. The parameter space $\Theta$ is partitioned into two disjoint sets $\Theta_0$ and $\Theta_1$. Let $H_0$ denote the hypothesis that $\theta \in \Theta_0$, and let $H_1$ denote the hypothesis that $\theta \in \Theta_1$. The binary test of hypotheses is performed using a test function $\varphi(x)$ with a reject region $R$ (a subset of measurement space).

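$$\varphi(x) = \begin{cases} 1 & \text{if } x \in R \\ 0 & \text{if } x \in R^c \end{cases}$$

meaning that $H_1$ is in force if the measurement $X \in R$ and that $H_0$ is in force if the measurement $X \in R^c$. Note that $R \cup R^c$ is a disjoint covering of the measurement space.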
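The test function can be sketched in a few lines of Python. This is only an illustration: the scalar measurement, the helper name `make_test_function`, and the reject region $R = \{x : x > 1.5\}$ are assumptions chosen for the example, not part of the definition.

```python
from typing import Callable


def make_test_function(in_reject_region: Callable[[float], bool]) -> Callable[[float], int]:
    """Build phi(x): 1 when x falls in the reject region R, 0 otherwise."""
    def phi(x: float) -> int:
        return 1 if in_reject_region(x) else 0
    return phi


# Hypothetical reject region R = {x : x > 1.5}; its complement is {x : x <= 1.5}.
phi = make_test_function(lambda x: x > 1.5)

# Every measurement lands in exactly one of R and its complement,
# so phi always returns either 1 (decide H1) or 0 (decide H0).
decisions = [phi(x) for x in (0.2, 1.5, 2.7)]
```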

Formal definition

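A test function $\varphi(x)$ is UMP of size $\alpha$ if for any other test function $\varphi'(x)$ satisfying

$$\sup_{\theta \in \Theta_0} \operatorname{E}[\varphi'(X) \mid \theta] = \alpha' \leq \alpha = \sup_{\theta \in \Theta_0} \operatorname{E}[\varphi(X) \mid \theta]$$

we have

$$\forall \theta \in \Theta_1, \quad \operatorname{E}[\varphi'(X) \mid \theta] = 1 - \beta'(\theta) \leq 1 - \beta(\theta) = \operatorname{E}[\varphi(X) \mid \theta].$$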
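The two conditions can be checked numerically in a simple case. The sketch below assumes a single scalar measurement $X \sim N(\theta, 1)$ with the simple null $\Theta_0 = \{\theta_0\}$ and $\Theta_1 = \{\theta > \theta_0\}$. It compares a one-sided threshold test with an equal-tailed two-sided test of the same size. The function names and the numeric values are illustrative assumptions.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumption: X ~ N(theta, 1), one scalar measurement
THETA0 = 0.0               # assumption: Theta_0 = {THETA0}, Theta_1 = {theta > THETA0}
ALPHA = 0.05


def power_one_sided(theta: float, alpha: float = ALPHA) -> float:
    """E[phi(X) | theta] for phi(x) = 1 if x > THETA0 + z_{1-alpha}."""
    threshold = THETA0 + STD_NORMAL.inv_cdf(1.0 - alpha)
    return 1.0 - STD_NORMAL.cdf(threshold - theta)


def power_two_sided(theta: float, alpha: float = ALPHA) -> float:
    """E[phi'(X) | theta] for phi'(x) = 1 if |x - THETA0| > z_{1-alpha/2}."""
    c = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    upper_tail = 1.0 - STD_NORMAL.cdf(THETA0 + c - theta)
    lower_tail = STD_NORMAL.cdf(THETA0 - c - theta)
    return upper_tail + lower_tail


# Both tests have size alpha on Theta_0 = {THETA0} ...
size_one_sided = power_one_sided(THETA0)
size_two_sided = power_two_sided(THETA0)

# ... and the one-sided test is at least as powerful at every grid point in Theta_1.
grid = [THETA0 + 0.1 * k for k in range(1, 41)]
dominates = all(power_one_sided(t) >= power_two_sided(t) for t in grid)
```

A finite grid cannot prove the inequality for all of $\Theta_1$; it only illustrates what the definition requires.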

The Karlin–Rubin theorem

The Karlin–Rubin theorem (named for Samuel Karlin and Herman Rubin) can be regarded as an extension of the Neyman–Pearson lemma for composite hypotheses.[1] Consider a scalar measurement having a probability density function parameterized by a scalar parameter $\theta$, and define the likelihood ratio $l(x) = f_{\theta_1}(x) / f_{\theta_0}(x)$. If $l(x)$ is monotone non-decreasing in $x$ for any pair $\theta_1 \geq \theta_0$ (meaning that the greater $x$ is, the more likely $H_1$ is), then the threshold test:

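$$\varphi(x) = \begin{cases} 1 & \text{if } x > x_0 \\ 0 & \text{if } x < x_0 \end{cases}$$
where $x_0$ is chosen such that $\operatorname{E}_{\theta_0} \varphi(X) = \alpha$

is the UMP test of size $\alpha$ for testing $H_0: \theta \leq \theta_0$ vs. $H_1: \theta > \theta_0$.

Note that exactly the same test is also UMP for testing $H_0: \theta = \theta_0$ vs. $H_1: \theta > \theta_0$.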
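As an illustration of the threshold test, the following sketch assumes a Gaussian measurement $X \sim N(\theta, \sigma^2)$ with known $\sigma$, a family whose likelihood ratio is monotone in $x$. The constants and function names are assumptions made for the example.

```python
from statistics import NormalDist

SIGMA = 2.0    # assumption: known standard deviation
THETA0 = 1.0   # assumption: boundary of the null hypothesis
ALPHA = 0.05


def likelihood_ratio(x: float, theta1: float, theta0: float) -> float:
    """l(x) = f_theta1(x) / f_theta0(x) for the Gaussian family N(theta, SIGMA^2)."""
    return NormalDist(theta1, SIGMA).pdf(x) / NormalDist(theta0, SIGMA).pdf(x)


def rejection_probability(theta: float, x0: float) -> float:
    """E_theta[phi(X)] = P_theta(X > x0) for the threshold test."""
    return 1.0 - NormalDist(theta, SIGMA).cdf(x0)


# Check on a grid that l(x) is non-decreasing in x for one pair theta1 >= theta0.
xs = [-5.0 + 0.5 * k for k in range(25)]
ratios = [likelihood_ratio(x, theta1=2.0, theta0=THETA0) for x in xs]
is_monotone = all(a <= b for a, b in zip(ratios, ratios[1:]))

# Choose x0 so that the rejection probability at theta0 equals alpha.
x0 = NormalDist(THETA0, SIGMA).inv_cdf(1.0 - ALPHA)

# For theta <= theta0 the rejection probability stays at or below alpha,
# so the size over the composite null is attained at theta0.
null_grid = [-2.0, 0.0, 0.5, THETA0]
size_ok = all(rejection_probability(t, x0) <= ALPHA + 1e-12 for t in null_grid)
```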

Important case: exponential family

Although the Karlin–Rubin theorem may seem weak because of its restriction to a scalar parameter and a scalar measurement, it turns out that there exists a host of problems for which the theorem holds. In particular, the one-dimensional exponential family of probability density functions or probability mass functions with

$$f_\theta(x) = g(\theta) h(x) \exp(\eta(\theta) T(x))$$

has a monotone non-decreasing likelihood ratio in the sufficient statistic $T(x)$, provided that $\eta(\theta)$ is non-decreasing.
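The monotonicity follows from a one-line computation, sketched here under the assumption that $g(\theta) > 0$ and that $h(x) > 0$ on the common support, so that the ratio is well defined:

```latex
\frac{f_{\theta_1}(x)}{f_{\theta_0}(x)}
  = \frac{g(\theta_1)\, h(x)\, \exp\bigl(\eta(\theta_1)\, T(x)\bigr)}
         {g(\theta_0)\, h(x)\, \exp\bigl(\eta(\theta_0)\, T(x)\bigr)}
  = \frac{g(\theta_1)}{g(\theta_0)}
    \exp\Bigl(\bigl[\eta(\theta_1) - \eta(\theta_0)\bigr]\, T(x)\Bigr)
```

The factor $h(x)$ cancels, so the ratio depends on $x$ only through $T(x)$. When $\eta$ is non-decreasing and $\theta_1 \geq \theta_0$, the coefficient $\eta(\theta_1) - \eta(\theta_0)$ is non-negative and the ratio is non-decreasing in $T(x)$.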

Example

Let $X = (X_0, \ldots, X_{M-1})$ denote i.i.d. normally distributed $N$-dimensional random vectors with mean $\theta m$ and covariance matrix $R$. We then have

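$$\begin{aligned}
f_\theta(X) &= (2\pi)^{-MN/2} |R|^{-M/2} \exp\left\{ -\frac{1}{2} \sum_{n=0}^{M-1} (X_n - \theta m)^T R^{-1} (X_n - \theta m) \right\} \\
&= (2\pi)^{-MN/2} |R|^{-M/2} \exp\left\{ -\frac{1}{2} \sum_{n=0}^{M-1} \theta^2 m^T R^{-1} m \right\} \exp\left\{ -\frac{1}{2} \sum_{n=0}^{M-1} X_n^T R^{-1} X_n \right\} \exp\left\{ \theta m^T R^{-1} \sum_{n=0}^{M-1} X_n \right\}
\end{aligned}$$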

which is exactly in the form of the exponential family shown in the previous section, with the sufficient statistic being

$$T(X) = m^T R^{-1} \sum_{n=0}^{M-1} X_n.$$

Thus, we conclude that the test

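$$\varphi(T) = \begin{cases} 1 & T > t_0 \\ 0 & T < t_0 \end{cases} \qquad \operatorname{E}_{\theta_0} \varphi(T) = \alpha$$

is the UMP test of size $\alpha$ for testing $H_0: \theta \leq \theta_0$ vs. $H_1: \theta > \theta_0$.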
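The example can be sketched in plain Python. The threshold uses the fact that $T(X)$ is a linear function of Gaussian vectors, so under $\theta_0$ it is normal with mean $M \theta_0 q$ and variance $M q$, where $q = m^T R^{-1} m$. The helper names, the small linear solver, and the numeric data are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def ump_threshold_test(samples, m, cov, theta0, alpha):
    """Return (T, t0, decision) for H0: theta <= theta0 vs. H1: theta > theta0."""
    w = solve(cov, m)                         # w = R^{-1} m
    q = sum(wi * mi for wi, mi in zip(w, m))  # q = m^T R^{-1} m
    count = len(samples)                      # M in the text
    # T(X) = m^T R^{-1} sum_n X_n; R is symmetric, so m^T R^{-1} = w^T.
    t_stat = sum(sum(wi * xi for wi, xi in zip(w, x)) for x in samples)
    # Under theta0, T ~ N(M * theta0 * q, M * q); t0 is its (1 - alpha) quantile.
    t0 = NormalDist(count * theta0 * q, sqrt(count * q)).inv_cdf(1.0 - alpha)
    return t_stat, t0, int(t_stat > t0)


# Hypothetical data: M = 3 measurements of dimension N = 2.
m_vec = [1.0, 2.0]
cov = [[2.0, 0.5], [0.5, 1.0]]
samples = [[0.3, 1.9], [1.2, 2.5], [-0.4, 2.2]]
t_stat, t0, decision = ump_threshold_test(samples, m_vec, cov, theta0=0.0, alpha=0.05)
```

For larger problems a numerical linear algebra library would normally replace the hand-written solver; it is kept here only so that the sketch has no dependencies.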

Further discussion

In general, UMP tests do not exist for vector parameters or for two-sided tests (a test in which one hypothesis lies on both sides of the alternative). The reason is that in these situations, the most powerful test of a given size for one possible value of the parameter (e.g. for θ1 where θ1>θ0) is different from the most powerful test of the same size for a different value of the parameter (e.g. for θ2 where θ2<θ0). As a result, no test is uniformly most powerful in these situations.
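The two-sided case can be illustrated numerically. The sketch assumes $X \sim N(\theta, 1)$ with $\theta_0 = 0$ and compares the right-tailed test (most powerful against alternatives above $\theta_0$) with the left-tailed test (most powerful against alternatives below $\theta_0$). All names and values are illustrative assumptions.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumption: X ~ N(theta, 1)
THETA0 = 0.0
ALPHA = 0.05
Z = STD_NORMAL.inv_cdf(1.0 - ALPHA)


def power_right_tailed(theta: float) -> float:
    """P_theta(X > THETA0 + z): the test that rejects for large x."""
    return 1.0 - STD_NORMAL.cdf(THETA0 + Z - theta)


def power_left_tailed(theta: float) -> float:
    """P_theta(X < THETA0 - z): the test that rejects for small x."""
    return STD_NORMAL.cdf(THETA0 - Z - theta)


theta1 = 1.0    # an alternative above THETA0
theta2 = -1.0   # an alternative below THETA0

# Both tests have size alpha at THETA0, but each wins on only one side.
right_wins_above = power_right_tailed(theta1) > power_left_tailed(theta1)
left_wins_below = power_left_tailed(theta2) > power_right_tailed(theta2)
```

Neither test dominates the other over the whole alternative $\theta \neq \theta_0$, which is the obstruction described above.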

References

  1. Casella, G.; Berger, R.L. (2008), Statistical Inference, Brooks/Cole. ISBN 0-495-39187-5 (Theorem 8.3.17)

Further reading

  • Ferguson, T. S. (1967). Mathematical Statistics: A decision theoretic approach. New York: Academic Press. 
  • Mood, A. M.; Graybill, F. A.; Boes, D. C. (1974). Introduction to the theory of statistics (3rd ed.). New York: McGraw-Hill. 
  • Scharf, L. L. (1991). Statistical Signal Processing. Addison-Wesley. Section 4.7.