Least trimmed squares
Least trimmed squares (LTS), or least trimmed sum of squares, is a robust statistical method that fits a function to a set of data whilst not being unduly affected by the presence of outliers.[1] It is one of a number of methods for robust regression.
Description of method
Instead of the standard least squares method, which minimises the sum of squared residuals over all n points, the LTS method minimises the sum of squared residuals over a subset of [math]\displaystyle{ k }[/math] of those points. The remaining [math]\displaystyle{ n - k }[/math] points do not influence the fit.
In a standard least squares problem, the estimated parameter values β are defined to be those values that minimise the objective function S(β) of squared residuals:
- [math]\displaystyle{ S(\beta) = \sum_{i=1}^n r_i(\beta)^2, }[/math]
where the residuals are defined as the differences between the values of the dependent variables (observations) and the model values:
- [math]\displaystyle{ r_i(\beta) = y_i - f(x_i, \beta), }[/math]
and where n is the overall number of data points. For a least trimmed squares analysis, this objective function is replaced by one constructed in the following way. For a fixed value of β, let [math]\displaystyle{ r_{(j)}(\beta) }[/math] denote the jth residual when the residuals are ordered by increasing absolute value, so that [math]\displaystyle{ |r_{(1)}(\beta)| \le |r_{(2)}(\beta)| \le \cdots \le |r_{(n)}(\beta)| }[/math]. In this notation, the standard sum of squares function is
- [math]\displaystyle{ S(\beta) = \sum_{j=1}^n r_{(j)}(\beta)^2, }[/math]
while the objective function for LTS is
- [math]\displaystyle{ S_k(\beta) = \sum_{j=1}^k r_{(j)}(\beta)^2. }[/math]
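To make the difference between the two objectives concrete, the following is a minimal sketch in Python with NumPy for a linear model in which [math]\displaystyle{ f(x_i, \beta) }[/math] is the inner product of the ith row of a design matrix X with β; the function name lts_objective and the use of NumPy are illustrative assumptions, not part of the original description.

```python
import numpy as np

def lts_objective(beta, X, y, k):
    """Sum of the k smallest squared residuals for the linear model y ~ X @ beta."""
    residuals = y - X @ beta           # r_i(beta) = y_i - f(x_i, beta)
    squared = np.sort(residuals ** 2)  # order the squared residuals, smallest first
    return squared[:k].sum()           # S_k(beta): keep only the k smallest terms
```

With k = n this reduces to the ordinary residual sum of squares; with k < n the largest residuals are simply ignored.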
Computational considerations
Because each point is either included in or excluded from the fit, the problem is combinatorial and no closed-form solution exists. As a result, methods for finding the LTS solution sift through subsets of the data, attempting to find the subset of k points that yields the lowest sum of squared residuals. Exact methods exist for small n; however, as n rises the number of candidate subsets grows rapidly, so practical algorithms generally seek approximate (but usually adequate) solutions, as in the sketch below.
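As an illustration of such an approximate approach, the sketch below implements a simple random-restart search with "concentration" refinements, in the spirit of the FAST-LTS algorithm of Rousseeuw and Van Driessen (not described in this article): starting from ordinary least-squares fits on small random subsets, it repeatedly refits on the k points with the smallest residuals and keeps the best fit found. The function names, the default number of restarts and refinement steps, and the use of NumPy's lstsq solver are assumptions made for the example.

```python
import numpy as np

def fit_subset(X, y, idx):
    """Ordinary least-squares fit on the rows of X and y indexed by idx."""
    beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
    return beta

def lts_fit(X, y, k, n_starts=500, n_steps=10, seed=None):
    """Approximate LTS estimate via random starts and concentration steps."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    best_beta, best_obj = None, np.inf
    for _ in range(n_starts):
        # Start from an OLS fit on a small random subset of the data.
        idx = rng.choice(n, size=p, replace=False)
        beta = fit_subset(X, y, idx)
        for _ in range(n_steps):
            # Concentration step: keep the k points with the smallest
            # squared residuals under the current fit, then refit on them.
            squared = (y - X @ beta) ** 2
            idx = np.argsort(squared)[:k]
            beta = fit_subset(X, y, idx)
        obj = np.sort((y - X @ beta) ** 2)[:k].sum()
        if obj < best_obj:
            best_beta, best_obj = beta, obj
    return best_beta, best_obj
```

The trimming fraction is set through k: choosing k of roughly (n + p + 1)/2, where p is the number of parameters, maximises the estimator's breakdown point, while larger values of k trade robustness for statistical efficiency.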
References
- ↑ Fox, John (2015). "Chapter 19". Applied Regression Analysis and Generalized Linear Models (3rd ed.). Thousand Oaks, CA: Sage.
- Rousseeuw, P. J. (1984). "Least Median of Squares Regression". Journal of the American Statistical Association 79 (388): 871–880. doi:10.1080/01621459.1984.10477105.
- Rousseeuw, P. J.; Leroy, A. M. (2005). Robust Regression and Outlier Detection. Wiley. doi:10.1002/0471725382. ISBN 978-0-471-85233-9.
- Li, L. M. (2005). "An algorithm for computing exact least-trimmed squares estimate of simple linear regression with constraints". Computational Statistics & Data Analysis 48 (4): 717–734. doi:10.1016/j.csda.2004.04.003.
- Atkinson, A. C.; Cheng, T.-C. (1999). "Computing least trimmed squares regression with the forward search". Statistics and Computing 9 (4): 251–263. doi:10.1023/A:1008942604045.
- Jung, Kang-Mo (2007). "Least Trimmed Squares Estimator in the Errors-in-Variables Model". Journal of Applied Statistics 34 (3): 331–338. doi:10.1080/02664760601004973.
Original source: https://en.wikipedia.org/wiki/Least_trimmed_squares