Lehmann–Scheffé theorem

In statistics, the Lehmann–Scheffé theorem is a prominent statement, tying together the ideas of completeness, sufficiency, uniqueness, and best unbiased estimation.[1] The theorem states that any estimator which is unbiased for a given unknown quantity and that depends on the data only through a complete, sufficient statistic is the unique best unbiased estimator of that quantity. The Lehmann–Scheffé theorem is named after Erich Leo Lehmann and Henry Scheffé, who established it in two early papers.[2][3]

If T is a complete sufficient statistic for θ and E(g(T)) = τ(θ), then g(T) is the uniformly minimum-variance unbiased estimator (UMVUE) of τ(θ).

Statement

Let [math]\displaystyle{ \vec{X}= X_1, X_2, \dots, X_n }[/math] be a random sample from a distribution with p.d.f. (or p.m.f. in the discrete case) [math]\displaystyle{ f(x;\theta) }[/math], where [math]\displaystyle{ \theta \in \Omega }[/math] is a parameter in the parameter space. Suppose [math]\displaystyle{ Y = u(\vec{X}) }[/math] is a sufficient statistic for θ, and let [math]\displaystyle{ \{ f_Y(y;\theta): \theta \in \Omega\} }[/math] be a complete family. If [math]\displaystyle{ \varphi }[/math] is a function such that [math]\displaystyle{ \operatorname{E}[\varphi(Y)] = \theta }[/math] for every [math]\displaystyle{ \theta \in \Omega }[/math], then [math]\displaystyle{ \varphi(Y) }[/math] is the unique MVUE of θ.
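The theorem can be illustrated numerically. In the Bernoulli(p) model, T = ΣXᵢ is a complete sufficient statistic and T/n is unbiased for p, so the sample mean is the UMVUE of p. The following simulation sketch (the Bernoulli model, parameter values, and sample sizes are illustrative choices, not from the theorem itself) compares it with the crude unbiased estimator X₁:

```python
import random
import statistics

def simulate(p=0.3, n=10, reps=100_000, seed=0):
    """Monte Carlo check: for Bernoulli(p) data, T = sum(X_i) is a
    complete sufficient statistic and phi(T) = T/n is unbiased for p,
    so by Lehmann-Scheffe the sample mean is the UMVUE of p."""
    rng = random.Random(seed)
    crude = []   # X_1 alone: unbiased, but ignores most of the data
    umvue = []   # T/n: a function of the complete sufficient statistic
    for _ in range(reps):
        xs = [1 if rng.random() < p else 0 for _ in range(n)]
        crude.append(xs[0])
        umvue.append(sum(xs) / n)
    return (statistics.fmean(umvue),
            statistics.pvariance(crude),
            statistics.pvariance(umvue))

mean_umvue, var_crude, var_umvue = simulate()
# Both estimators are unbiased, but the UMVUE's variance is
# p(1-p)/n in theory, versus p(1-p) for the crude estimator.
```

The crude estimator X₁ is exactly what Rao–Blackwellization improves: conditioning X₁ on T yields T/n, and completeness of T then makes that improvement unique.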

Proof

By the Rao–Blackwell theorem, if [math]\displaystyle{ Z }[/math] is an unbiased estimator of θ then [math]\displaystyle{ \varphi(Y):= \operatorname{E}[Z\mid Y] }[/math] defines an unbiased estimator of θ with the property that its variance is not greater than that of [math]\displaystyle{ Z }[/math].

Now we show that this function is unique. Suppose [math]\displaystyle{ W }[/math] is another candidate MVUE of θ. Then [math]\displaystyle{ \psi(Y):= \operatorname{E}[W\mid Y] }[/math] again defines an unbiased estimator of θ whose variance is not greater than that of [math]\displaystyle{ W }[/math]. Then

[math]\displaystyle{ \operatorname{E}[\varphi(Y) - \psi(Y)] = 0 \text{ for all } \theta \in \Omega. }[/math]

Since [math]\displaystyle{ \{ f_Y(y;\theta): \theta \in \Omega\} }[/math] is a complete family,

[math]\displaystyle{ \operatorname{E}[\varphi(Y) - \psi(Y)] = 0 \text{ for all } \theta \in \Omega \implies \varphi(y) - \psi(y) = 0 \text{ for almost all } y, }[/math]

and therefore [math]\displaystyle{ \varphi(Y) = \psi(Y) }[/math] almost surely. Hence [math]\displaystyle{ \varphi }[/math] is the unique function of Y whose variance is not greater than that of any other unbiased estimator, and we conclude that [math]\displaystyle{ \varphi(Y) }[/math] is the MVUE.

Example: a minimal sufficient statistic that is not complete

An example of an improvable Rao–Blackwell improvement, when using a minimal sufficient statistic that is not complete, was provided by Galili and Meilijson in 2016.[4] Let [math]\displaystyle{ X_1, \ldots, X_n }[/math] be a random sample from a scale-uniform distribution [math]\displaystyle{ X \sim U ( (1-k) \theta, (1+k) \theta), }[/math] with unknown mean [math]\displaystyle{ \operatorname{E}[X]=\theta }[/math] and known design parameter [math]\displaystyle{ k \in (0,1) }[/math]. In the search for "best" possible unbiased estimators for [math]\displaystyle{ \theta }[/math], it is natural to consider [math]\displaystyle{ X_1 }[/math] as an initial (crude) unbiased estimator for [math]\displaystyle{ \theta }[/math] and then try to improve it. Since [math]\displaystyle{ X_1 }[/math] is not a function of [math]\displaystyle{ T = \left( X_{(1)}, X_{(n)} \right) }[/math], the minimal sufficient statistic for [math]\displaystyle{ \theta }[/math] (where [math]\displaystyle{ X_{(1)} = \min_i X_i }[/math] and [math]\displaystyle{ X_{(n)} = \max_i X_i }[/math]), it may be improved using the Rao–Blackwell theorem as follows:

[math]\displaystyle{ \hat{\theta}_{RB} =\operatorname{E}_\theta[X_1\mid X_{(1)}, X_{( n)}] = \frac{X_{(1)}+X_{(n)}} 2. }[/math]

However, the following unbiased estimator can be shown to have lower variance:

[math]\displaystyle{ \hat{\theta}_{LV} = \frac 1 {k^2\frac{n-1}{n+1}+1} \cdot \frac{(1-k)X_{(1)} + (1+k) X_{(n)}} 2. }[/math]

In fact, it can be improved even further by using the following estimator:

[math]\displaystyle{ \hat{\theta}_\text{BAYES}=\frac{n+1} n \left[1- \frac{\frac{X_{(1)} (1+k)}{X_{(n)} (1-k)}-1}{ \left (\frac{X_{(1)} (1+k)}{X_{(n)} (1-k)}\right )^{n+1} -1} \right] \frac{X_{(n)}}{1+k} }[/math]
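The three estimators above can be compared by Monte Carlo simulation. The sketch below implements them directly from the formulas; the particular values θ = 1, k = 0.5, n = 10 are illustrative choices, not taken from the source:

```python
import random
import statistics

def estimators(xs, k):
    """The three unbiased estimators of theta from the scale-uniform
    example, all functions of the order statistics X_(1) and X_(n)."""
    n = len(xs)
    x1, xn = min(xs), max(xs)
    # Rao-Blackwell improvement of X_1: the midrange.
    rb = (x1 + xn) / 2
    # Lower-variance linear combination of the order statistics.
    lv = ((1 - k) * x1 + (1 + k) * xn) / 2 / (k**2 * (n - 1) / (n + 1) + 1)
    # Unbiased generalized Bayes estimator.
    r = (x1 * (1 + k)) / (xn * (1 - k))
    bayes = (n + 1) / n * (1 - (r - 1) / (r**(n + 1) - 1)) * xn / (1 + k)
    return rb, lv, bayes

def simulate(theta=1.0, k=0.5, n=10, reps=100_000, seed=0):
    """Return (mean, variance) of each estimator over many samples
    from U((1-k)*theta, (1+k)*theta)."""
    rng = random.Random(seed)
    samples = {"rb": [], "lv": [], "bayes": []}
    lo, hi = (1 - k) * theta, (1 + k) * theta
    for _ in range(reps):
        xs = [rng.uniform(lo, hi) for _ in range(n)]
        rb, lv, bayes = estimators(xs, k)
        samples["rb"].append(rb)
        samples["lv"].append(lv)
        samples["bayes"].append(bayes)
    return {name: (statistics.fmean(v), statistics.pvariance(v))
            for name, v in samples.items()}

results = simulate()
```

All three estimators should come out (approximately) unbiased, with the latter two showing smaller variance than the Rao–Blackwell midrange, consistent with the minimal sufficient statistic here not being complete.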

The model is a scale model. Optimal equivariant estimators can then be derived for loss functions that are invariant.[5]

References

  1. Casella, George; Berger, Roger L. (2001). Statistical Inference. Duxbury Press. p. 369. ISBN 978-0-534-24312-8. 
  2. Lehmann, E. L.; Scheffé, H. (1950). "Completeness, similar regions, and unbiased estimation. I". Sankhyā 10 (4): 305–340. doi:10.1007/978-1-4614-1412-4_23. 
  3. Lehmann, E. L.; Scheffé, H. (1955). "Completeness, similar regions, and unbiased estimation. II". Sankhyā 15 (3): 219–236. doi:10.1007/978-1-4614-1412-4_24. 
  4. Galili, Tal; Meilijson, Isaac (2016). "An Example of an Improvable Rao–Blackwell Improvement, Inefficient Maximum Likelihood Estimator, and Unbiased Generalized Bayes Estimator". The American Statistician 70 (1): 108–113. doi:10.1080/00031305.2015.1100683. PMID 27499547. 
  5. Taraldsen, Gunnar (2020). "Micha Mandel (2020), “The Scaled Uniform Model Revisited,” The American Statistician, 74:1, 98–100: Comment". The American Statistician 74 (3): 315. doi:10.1080/00031305.2020.1769727.