Mean signed deviation

In statistics, the mean signed difference (MSD), also known as mean signed deviation and mean signed error, is a sample statistic that summarises how well a set of estimates [math]\displaystyle{ \hat{\theta}_i }[/math] matches the quantities [math]\displaystyle{ \theta_i }[/math] that they are supposed to estimate. It is one of several statistics that can be used to assess an estimation procedure, and it is often used in conjunction with a sample version of the mean square error.

For example, suppose a linear regression model has been estimated over a sample of data and is then used to predict the dependent variable out of sample. Once the out-of-sample data points have become available, the predictions can be compared with them: [math]\displaystyle{ \theta_i }[/math] is the i-th out-of-sample value of the dependent variable, and [math]\displaystyle{ \hat{\theta}_i }[/math] is its predicted value. The mean signed deviation is the average value of [math]\displaystyle{ \hat{\theta}_i-\theta_i }[/math].

Definition

The mean signed difference is derived from a set of n pairs, [math]\displaystyle{ ( \hat{\theta}_i,\theta_i) }[/math], where [math]\displaystyle{ \hat{\theta}_i }[/math] is an estimate of the parameter [math]\displaystyle{ \theta }[/math] in a case where it is known that [math]\displaystyle{ \theta=\theta_i }[/math]. In many applications, all the quantities [math]\displaystyle{ \theta_i }[/math] will share a common value. In time series forecasting, a forecasting procedure might be evaluated using the mean signed difference, with [math]\displaystyle{ \hat{\theta}_i }[/math] being the predicted value of a series at a given lead time and [math]\displaystyle{ \theta_i }[/math] being the value of the series eventually observed for that time-point. The mean signed difference is defined to be

[math]\displaystyle{ \operatorname{MSD}(\hat{\theta}) = \frac{1}{n}\sum^{n}_{i=1} \left( \hat{\theta}_i - \theta_i \right) . }[/math]
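
As an illustration of the definition, the following is a minimal Python sketch rather than a reference implementation. The function name mean_signed_difference and the sample values are assumptions chosen for this example; each difference is taken as estimate minus true value, matching the formula.

```python
from typing import Sequence


def mean_signed_difference(estimates: Sequence[float],
                           actuals: Sequence[float]) -> float:
    """Return the average of (estimate - actual) over the paired values."""
    if len(estimates) != len(actuals):
        raise ValueError("estimates and actuals must have the same length")
    if len(estimates) == 0:
        raise ValueError("at least one pair is required")
    # The sign of each difference is kept, so errors in opposite
    # directions partly cancel in the sum.
    return sum(e - a for e, a in zip(estimates, actuals)) / len(estimates)


# Illustrative (made-up) forecasts and the values eventually observed.
predicted = [10.0, 12.0, 9.0, 11.0]
observed = [9.0, 12.0, 10.0, 9.0]

# Differences are +1, 0, -1, +2, so the mean is 2 / 4.
msd = mean_signed_difference(predicted, observed)
print(msd)  # 0.5
```

A positive result such as this one means the estimates were, on average, higher than the quantities being estimated.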

Use cases

The mean signed difference is most useful for detecting estimates [math]\displaystyle{ \hat{\theta}_i }[/math] that deviate systematically from the true values [math]\displaystyle{ \theta_i }[/math] in one direction. If the estimator that produces the [math]\displaystyle{ \hat{\theta}_i }[/math] values is unbiased, then the expected value of [math]\displaystyle{ \operatorname{MSD}(\hat{\theta}) }[/math] is zero, so the sample value should be close to zero, although it will rarely be exactly zero. If the estimates are instead produced by a biased estimator, the sign of the mean signed difference indicates the direction of the bias: because each difference is taken as estimate minus true value, a positive value means the estimates are too high on average, and a negative value means they are too low.
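
The following Python sketch, built on made-up numbers, shows why the mean signed difference is usually read alongside a sample mean square error. The helper names msd and mse and all data values are assumptions for this example. Estimates that are consistently too high or too low give a mean signed difference with the matching sign, while errors that alternate in sign can cancel to zero even though the mean square error is the same in all three cases.

```python
def msd(estimates, actuals):
    """Mean signed difference: average of (estimate - actual)."""
    return sum(e - a for e, a in zip(estimates, actuals)) / len(actuals)


def mse(estimates, actuals):
    """Sample mean square error: average of (estimate - actual) squared."""
    return sum((e - a) ** 2 for e, a in zip(estimates, actuals)) / len(actuals)


# Illustrative true values and three hypothetical sets of estimates.
actuals = [4.0, 6.0, 8.0, 10.0]
too_high = [a + 1.0 for a in actuals]  # every estimate is 1 too high
too_low = [a - 1.0 for a in actuals]   # every estimate is 1 too low
mixed = [5.0, 5.0, 9.0, 9.0]           # errors of +1, -1, +1, -1

print(msd(too_high, actuals), mse(too_high, actuals))  # 1.0 1.0
print(msd(too_low, actuals), mse(too_low, actuals))    # -1.0 1.0
print(msd(mixed, actuals), mse(mixed, actuals))        # 0.0 1.0
```

In the third case the mean signed difference alone would suggest accurate estimates, which is why it is better treated as an indicator of the direction of bias than as a measure of overall accuracy.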

See also