Stein's unbiased risk estimate
In statistics, Stein's unbiased risk estimate (SURE) is an unbiased estimator of the mean-squared error of "a nearly arbitrary, nonlinear biased estimator."[1] In other words, it provides an indication of the accuracy of a given estimator. This is important since the true mean-squared error of an estimator is a function of the unknown parameter to be estimated, and thus cannot be determined exactly. The technique is named after its discoverer, Charles Stein.[2]
Formal statement
Let [math]\displaystyle{ \mu \in {\mathbb R}^d }[/math] be an unknown parameter and let [math]\displaystyle{ x \in {\mathbb R}^d }[/math] be a measurement vector whose components are independent and normally distributed with means [math]\displaystyle{ \mu_i, i=1,...,d, }[/math] and common variance [math]\displaystyle{ \sigma^2 }[/math]. Suppose [math]\displaystyle{ h(x) }[/math] is an estimator of [math]\displaystyle{ \mu }[/math] from [math]\displaystyle{ x }[/math] that can be written as [math]\displaystyle{ h(x) = x + g(x) }[/math], where [math]\displaystyle{ g }[/math] is weakly differentiable. Then Stein's unbiased risk estimate is given by[3]
- [math]\displaystyle{ \operatorname{SURE}(h) = d\sigma^2 + \|g(x)\|^2 + 2 \sigma^2 \sum_{i=1}^d \frac{\partial}{\partial x_i} g_i(x) = -d\sigma^2 + \|g(x)\|^2 + 2 \sigma^2 \sum_{i=1}^d \frac{\partial}{\partial x_i} h_i(x), }[/math]
where [math]\displaystyle{ g_i(x) }[/math] is the [math]\displaystyle{ i }[/math]th component of the function [math]\displaystyle{ g(x) }[/math], and [math]\displaystyle{ \|\cdot\| }[/math] is the Euclidean norm. The two expressions agree because [math]\displaystyle{ h_i(x) = x_i + g_i(x) }[/math], so that [math]\displaystyle{ \sum_{i=1}^d \partial h_i/\partial x_i = d + \sum_{i=1}^d \partial g_i/\partial x_i }[/math].
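As a concrete illustration (the linear form below is a choice made here for the example, not part of the general statement), take the shrinkage estimator [math]\displaystyle{ h(x) = a x }[/math] for a fixed [math]\displaystyle{ a \in {\mathbb R} }[/math]. Then [math]\displaystyle{ g(x) = (a-1)x }[/math] and [math]\displaystyle{ \sum_{i=1}^d \partial g_i/\partial x_i = d(a-1) }[/math], so
- [math]\displaystyle{ \operatorname{SURE}(h) = d\sigma^2 + (a-1)^2 \|x\|^2 + 2 d \sigma^2 (a-1), }[/math]
which can be evaluated, and minimized over [math]\displaystyle{ a }[/math], using the data alone.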
The importance of SURE is that it is an unbiased estimate of the mean-squared error (or squared error risk) of [math]\displaystyle{ h(x) }[/math], i.e.
- [math]\displaystyle{ \operatorname E_\mu \{ \operatorname{SURE}(h) \} = \operatorname{MSE}(h),\,\! }[/math]
with
- [math]\displaystyle{ \operatorname{MSE}(h) = \operatorname E_\mu \|h(x)-\mu\|^2. }[/math]
Thus, minimizing SURE can act as a surrogate for minimizing the MSE. Note that there is no dependence on the unknown parameter [math]\displaystyle{ \mu }[/math] in the expression for SURE above. Thus, it can be manipulated (e.g., to determine optimal estimation settings) without knowledge of [math]\displaystyle{ \mu }[/math].
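A minimal numerical sketch of this property is given below (Python with NumPy; the dimension, noise level, shrinkage factor, and number of trials are arbitrary choices made for the illustration). It compares the average of SURE over repeated measurements with the average squared error of the linear shrinkage estimator [math]\displaystyle{ h(x) = a x }[/math] from the example above; only the squared-error computation uses the true [math]\displaystyle{ \mu }[/math].

```python
import numpy as np

# Monte Carlo check that SURE is unbiased for the MSE, using the linear
# shrinkage estimator h(x) = a*x, so g(x) = (a-1)*x and sum_i dg_i/dx_i = d*(a-1).
# All numerical settings below are illustrative choices.
rng = np.random.default_rng(0)
d, sigma, a = 50, 1.0, 0.7
mu = rng.normal(size=d)            # "unknown" mean, used only to simulate data and the true error
n_trials = 20000

sure_vals = np.empty(n_trials)
err_vals = np.empty(n_trials)
for t in range(n_trials):
    x = mu + sigma * rng.normal(size=d)   # x ~ N(mu, sigma^2 I)
    h = a * x                             # the estimator
    g = h - x                             # g(x) = h(x) - x
    div_g = d * (a - 1.0)                 # sum_i dg_i/dx_i, known in closed form here
    sure_vals[t] = d * sigma**2 + np.sum(g**2) + 2 * sigma**2 * div_g
    err_vals[t] = np.sum((h - mu)**2)     # squared error; this is the only place mu is needed

print("average SURE        :", sure_vals.mean())  # the two averages should agree
print("average squared err :", err_vals.mean())   # up to Monte Carlo noise
```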
Proof
We wish to show that
- [math]\displaystyle{ \operatorname E_\mu \|h(x)-\mu\|^2 = \operatorname E_\mu \{ \operatorname{SURE}(h) \}. }[/math]
We start by expanding the MSE as
- [math]\displaystyle{ \begin{align} \operatorname E_\mu \| h(x) - \mu\|^2 & = \operatorname E_\mu \|g(x) + x - \mu\|^2 \\ & = \operatorname E_\mu \|g(x)\|^2 + \operatorname E_\mu \|x - \mu\|^2 + 2 \operatorname E_\mu g(x)^T (x - \mu) \\ & = \operatorname E_\mu \|g(x)\|^2 + d \sigma^2 + 2 \operatorname E_\mu g(x)^T(x - \mu). \end{align} }[/math]
Now we use integration by parts to rewrite the last term. Since [math]\displaystyle{ (x_i - \mu_i) \exp\left(-\frac{\|x - \mu\|^2}{2 \sigma^2}\right) = -\sigma^2 \frac{\partial}{\partial x_i} \exp\left(-\frac{\|x - \mu\|^2}{2 \sigma^2}\right) }[/math], integrating by parts in each [math]\displaystyle{ x_i }[/math] moves the derivative onto [math]\displaystyle{ g_i }[/math]:
- [math]\displaystyle{ \begin{align} \operatorname E_\mu g(x)^T(x - \mu) & = \int_{{\mathbb R}^d} \frac{1}{(2 \pi \sigma^2)^{d/2}} \exp\left(-\frac{\|x - \mu\|^2}{2 \sigma^2} \right) \sum_{i=1}^d g_i(x) (x_i - \mu_i) \, d^d x \\ & = \sigma^2 \sum_{i=1}^d \int_{{\mathbb R}^d} \frac{1}{(2 \pi \sigma^2)^{d/2}} \exp\left(-\frac{\|x - \mu\|^2}{2 \sigma^2} \right) \frac{\partial g_i}{\partial x_i} \, d^d x \\ & = \sigma^2 \sum_{i=1}^d \operatorname E_\mu \frac{\partial g_i}{\partial x_i}. \end{align} }[/math]
Substituting this into the expression for the MSE, we arrive at
- [math]\displaystyle{ \operatorname E_\mu \|h(x) - \mu\|^2 = \operatorname E_\mu \left( d\sigma^2 + \|g(x)\|^2 + 2\sigma^2 \sum_{i=1}^d \frac{\partial g_i}{\partial x_i}\right) = \operatorname E_\mu \{ \operatorname{SURE}(h) \}, }[/math]
which completes the proof.
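The key step above, [math]\displaystyle{ \operatorname E_\mu\, g(x)^T(x-\mu) = \sigma^2 \sum_{i=1}^d \operatorname E_\mu\, \partial g_i/\partial x_i }[/math] (a form of Stein's lemma), can also be checked numerically. The sketch below does so for an arbitrarily chosen smooth [math]\displaystyle{ g_i(x) = -\tanh(x_i) }[/math], whose derivative is [math]\displaystyle{ -(1-\tanh^2(x_i)) }[/math]; all numerical settings are illustrative.

```python
import numpy as np

# Numerical check of the integration-by-parts identity
#   E[ g(x)^T (x - mu) ] = sigma^2 * sum_i E[ dg_i/dx_i ]
# for the (arbitrary) smooth choice g_i(x) = -tanh(x_i).
rng = np.random.default_rng(1)
d, sigma = 10, 0.5
mu = rng.normal(size=d)
x = mu + sigma * rng.normal(size=(200_000, d))       # many draws of x ~ N(mu, sigma^2 I)

g = -np.tanh(x)                                      # g applied componentwise
dg = -(1.0 - np.tanh(x) ** 2)                        # dg_i/dx_i, componentwise
lhs = np.mean(np.sum(g * (x - mu), axis=1))          # E[ g(x)^T (x - mu) ]
rhs = sigma**2 * np.mean(np.sum(dg, axis=1))         # sigma^2 * sum_i E[ dg_i/dx_i ]
print(lhs, rhs)                                      # should agree up to Monte Carlo error
```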
Applications
A standard application of SURE is to choose a parametric form for an estimator, and then optimize the values of the parameters to minimize the risk estimate. This technique has been applied in several settings. For example, a variant of the James–Stein estimator can be derived by finding the optimal shrinkage estimator.[2] The technique has also been used by Donoho and Johnstone to determine the optimal shrinkage factor in a wavelet denoising setting.[1]
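To make the parameter-selection idea concrete, the sketch below applies SURE to soft thresholding, [math]\displaystyle{ h_i(x) = \operatorname{sign}(x_i)\,(|x_i| - t)_+ }[/math], for which [math]\displaystyle{ g_i(x) = -\operatorname{sign}(x_i)\min(|x_i|, t) }[/math] and [math]\displaystyle{ \partial g_i/\partial x_i = -\mathbf{1}\{|x_i|\le t\} }[/math] almost everywhere, so SURE has a closed form in [math]\displaystyle{ t }[/math]. This is only a sketch in the spirit of SURE-based threshold selection, with an artificial sparse signal and arbitrary settings, not the full procedure of Donoho and Johnstone.

```python
import numpy as np

# SURE for soft thresholding h_i(x) = sign(x_i) * max(|x_i| - t, 0):
# here ||g(x)||^2 = sum_i min(|x_i|, t)^2 and sum_i dg_i/dx_i = -#{i : |x_i| <= t},
# so  SURE(t) = d*sigma^2 + sum_i min(|x_i|, t)^2 - 2*sigma^2 * #{i : |x_i| <= t}.
def sure_soft_threshold(x, sigma, t):
    d = x.size
    return (d * sigma**2
            + np.sum(np.minimum(np.abs(x), t)**2)
            - 2 * sigma**2 * np.sum(np.abs(x) <= t))

# Illustrative sparse signal observed in Gaussian noise.
rng = np.random.default_rng(2)
d, sigma = 1000, 1.0
mu = np.zeros(d)
mu[:20] = 5.0                                   # a few large coefficients
x = mu + sigma * rng.normal(size=d)

# Choose the threshold by minimizing SURE over a grid; no knowledge of mu is needed.
grid = np.linspace(0.0, sigma * np.sqrt(2 * np.log(d)), 200)
sure_vals = np.array([sure_soft_threshold(x, sigma, t) for t in grid])
t_star = grid[np.argmin(sure_vals)]

h = np.sign(x) * np.maximum(np.abs(x) - t_star, 0.0)   # denoised estimate
print("SURE-selected threshold:", t_star)
print("estimated risk (SURE)  :", sure_vals.min())
print("actual squared error   :", np.sum((h - mu)**2))
```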
References
- ↑ 1.0 1.1 Donoho, David L.; Iain M. Johnstone (December 1995). "Adapting to Unknown Smoothness via Wavelet Shrinkage". Journal of the American Statistical Association 90 (432): 1200–1244. doi:10.2307/2291512.
- ↑ 2.0 2.1 Stein, Charles M. (November 1981). "Estimation of the Mean of a Multivariate Normal Distribution". The Annals of Statistics 9 (6): 1135–1151. doi:10.1214/aos/1176345632.
- ↑ Wasserman, Larry (2005). All of Nonparametric Statistics.