S-estimator

The goal of S-estimators is to provide a simple high-breakdown regression estimator that shares the flexibility and nice asymptotic properties of M-estimators. The name "S-estimators" was chosen because they are based on estimators of scale. We consider estimators of scale defined by a function [math]\displaystyle{ \rho }[/math] that satisfies

  • R1 – [math]\displaystyle{ \rho }[/math] is symmetric, continuously differentiable and [math]\displaystyle{ \rho(0)=0 }[/math].
  • R2 – there exists [math]\displaystyle{ c \gt 0 }[/math] such that [math]\displaystyle{ \rho }[/math] is strictly increasing on [math]\displaystyle{ [0, c] }[/math] and constant on [math]\displaystyle{ [c, \infty) }[/math].

For any sample [math]\displaystyle{ \{r_1, ..., r_n\} }[/math] of real numbers, we define the scale estimate [math]\displaystyle{ s(r_1, ..., r_n) }[/math] as the solution of

[math]\displaystyle{ \frac{1}{n}\sum_{i=1}^n \rho(r_i/s) = K }[/math],

where [math]\displaystyle{ K }[/math] is the expected value of [math]\displaystyle{ \rho }[/math] under the standard normal distribution. (If the equation above has more than one solution, we take the smallest one; if it has no solution, we set [math]\displaystyle{ s(r_1, ..., r_n)=0 }[/math].)
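The scale equation above can be solved numerically. The following is a minimal sketch, not a reference implementation. It assumes Tukey's biweight as the function [math]\displaystyle{ \rho }[/math] (rescaled so that its maximum is 1) with tuning constant 1.547, a common choice that the text above does not prescribe. The names `rho`, `normal_expectation` and `m_scale` are hypothetical, and bisection is only one of several possible root-finding methods.

```python
import numpy as np

C = 1.547  # assumed tuning constant for the biweight; not fixed by the definition


def rho(u, c=C):
    """Tukey's biweight, rescaled to rise from 0 at u = 0 to 1 at |u| >= c."""
    v = np.clip(np.abs(u) / c, 0.0, 1.0)
    return 1.0 - (1.0 - v**2) ** 3


def normal_expectation(f, half_width=10.0, m=200001):
    """Approximate E[f(Z)] for Z ~ N(0, 1) with a Riemann sum on a fine grid."""
    z = np.linspace(-half_width, half_width, m)
    phi = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return float(np.sum(f(z) * phi) * (z[1] - z[0]))


K = normal_expectation(rho)  # right-hand side of the scale equation


def m_scale(r, K=K, tol=1e-12, max_iter=200):
    """Solve mean(rho(r_i / s)) = K for s by bisection."""
    a = np.abs(np.asarray(r, dtype=float))
    # If too few residuals are non-zero, the left-hand side can never
    # exceed K, so we fall back to the convention s = 0.
    if np.mean(a > 0) <= K:
        return 0.0

    def g(s):
        return np.mean(rho(a / s)) - K

    # g is non-increasing in s: positive for small s, negative for large s.
    lo, hi = 0.0, a.max() / C
    while g(hi) > 0:
        hi *= 2.0
    for _ in range(max_iter):
        if hi - lo <= tol * hi:
            break
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Because the bounded [math]\displaystyle{ \rho }[/math] assigns the same value to every sufficiently large residual, a single gross outlier in the sample has only a limited effect on the resulting scale.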

Definition:

Let [math]\displaystyle{ (x_1, y_1), ..., (x_n, y_n) }[/math] be a sample of regression data with p-dimensional [math]\displaystyle{ x_i }[/math]. For each vector [math]\displaystyle{ \theta }[/math], we obtain residuals [math]\displaystyle{ r_i(\theta) = y_i - x_i^T\theta }[/math] and compute their scale [math]\displaystyle{ s(r_1(\theta),..., r_n(\theta)) }[/math] by solving the scale equation above, where [math]\displaystyle{ \rho }[/math] satisfies R1 and R2. The S-estimator [math]\displaystyle{ \hat\theta }[/math] is defined by

[math]\displaystyle{ \hat\theta = \arg\min_\theta \, s(r_1(\theta),..., r_n(\theta)) }[/math]

and the final scale estimator [math]\displaystyle{ \hat \sigma }[/math] is then

[math]\displaystyle{ \hat\sigma = s(r_1(\hat\theta), ..., r_n(\hat\theta)) }[/math].[1]
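The minimization over [math]\displaystyle{ \theta }[/math] is not convex in general, so in practice it is usually approximated. The sketch below illustrates one crude approach: fit exact lines through random subsets of p observations and keep the candidate with the smallest residual scale. It is an illustration under stated assumptions, not the algorithm of the cited reference. It assumes Tukey's biweight with tuning constant 1.547 and K = 0.5 (approximately the normal expectation of that rescaled function), and the names `rho`, `m_scale` and `s_estimator` are hypothetical.

```python
import numpy as np

C = 1.547  # assumed biweight tuning constant
K = 0.5    # assumed: approximately E[rho(Z)] for Z ~ N(0, 1) with this C


def rho(u):
    """Tukey's biweight, rescaled to rise from 0 at u = 0 to 1 at |u| >= C."""
    v = np.clip(np.abs(u) / C, 0.0, 1.0)
    return 1.0 - (1.0 - v**2) ** 3


def m_scale(r, tol=1e-10, max_iter=200):
    """Solve mean(rho(r_i / s)) = K for s by bisection (s = 0 if no solution)."""
    a = np.abs(np.asarray(r, dtype=float))
    if np.mean(a > 0) <= K:
        return 0.0
    lo, hi = 0.0, a.max() / C
    while np.mean(rho(a / hi)) > K:
        hi *= 2.0
    for _ in range(max_iter):
        if hi - lo <= tol * hi:
            break
        mid = 0.5 * (lo + hi)
        if np.mean(rho(a / mid)) > K:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def s_estimator(X, y, n_subsets=500, seed=0):
    """Approximate the S-estimator by a random search over exact p-point fits.

    X is an (n, p) design matrix (include a column of ones for an intercept).
    Returns (theta_hat, sigma_hat), where sigma_hat is the scale of the
    residuals at theta_hat.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    rng = np.random.default_rng(seed)
    best_theta, best_s = None, np.inf
    for _ in range(n_subsets):
        idx = rng.choice(n, size=p, replace=False)
        try:
            theta = np.linalg.solve(X[idx], y[idx])  # exact fit to p points
        except np.linalg.LinAlgError:
            continue  # singular subset, draw another one
        s = m_scale(y - X @ theta)
        if s < best_s:
            best_theta, best_s = theta, s
    return best_theta, best_s
```

A call such as `theta_hat, sigma_hat = s_estimator(X, y)` returns both the coefficient estimate and the final scale estimate. The quality of the approximation depends on the number of subsets drawn, and a local refinement of the best candidates would normally follow in a more careful implementation.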

References

  1. P. Rousseeuw and V. Yohai, "Robust Regression by Means of S-estimators", in Robust and Nonlinear Time Series Analysis, pages 256–272, 1984.