Standard deviation line

Plot of the standard deviation line (SD line), dashed, and the regression line, solid, for a scatter diagram of 20 points.

In statistics, the standard deviation line (or SD line) marks points on a scatter plot that are an equal number of standard deviations away from the average in each dimension. For example, in a 2-dimensional scatter diagram with variables [math]\displaystyle{ x }[/math] and [math]\displaystyle{ y }[/math] whose correlation is positive, the point that is 1 standard deviation above the mean of [math]\displaystyle{ x }[/math] and also 1 standard deviation above the mean of [math]\displaystyle{ y }[/math] is on the SD line.[1] The SD line is a useful visual tool since points in a scatter diagram tend to cluster around it,[1] more or less tightly depending on the strength of their correlation.

Properties

Relation to regression line

The SD line goes through the point of averages and has a slope of [math]\displaystyle{ \frac{\sigma_y}{\sigma_x} }[/math] when the correlation between [math]\displaystyle{ x }[/math] and [math]\displaystyle{ y }[/math] is positive, and [math]\displaystyle{ -\frac{\sigma_y}{\sigma_x} }[/math] when the correlation is negative.[1][2] Unlike the regression line, the SD line does not take into account the strength of the relationship between [math]\displaystyle{ x }[/math] and [math]\displaystyle{ y }[/math]; only the sign of the correlation affects it.[3] The slope of the SD line is related to that of the regression line by [math]\displaystyle{ a = r \frac{\sigma_y}{\sigma_x} }[/math] where [math]\displaystyle{ a }[/math] is the slope of the regression line, [math]\displaystyle{ r }[/math] is the correlation coefficient, and [math]\displaystyle{ \frac{\sigma_y}{\sigma_x} }[/math] is the magnitude of the slope of the SD line.[2] Since [math]\displaystyle{ |r| \le 1 }[/math], the regression line is never steeper than the SD line, and the two coincide only when [math]\displaystyle{ |r| = 1 }[/math].
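The relation between the two slopes can be checked numerically. The following is a minimal sketch, assuming NumPy is available and using population standard deviations (ddof=0); the function name sd_line_and_regression and the sample data are hypothetical and chosen only for illustration. The source does not say which sign to use when the correlation is exactly zero, so the sketch arbitrarily takes the positive sign in that case.

```python
import numpy as np


def sd_line_and_regression(x, y):
    """Return (point of averages, SD-line slope, regression slope, r).

    Illustrative helper, not taken from the article's references.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Both lines pass through the point of averages.
    point_of_averages = (x.mean(), y.mean())

    # Population standard deviations (NumPy's default, ddof=0).
    sd_x = x.std()
    sd_y = y.std()

    # Pearson correlation coefficient.
    r = np.corrcoef(x, y)[0, 1]

    # Assumption: treat r == 0 as "positive" so the slope is defined.
    sign = -1.0 if r < 0 else 1.0

    sd_slope = sign * sd_y / sd_x   # slope of the SD line
    reg_slope = r * sd_y / sd_x     # slope of the regression line of y on x
    return point_of_averages, sd_slope, reg_slope, r


if __name__ == "__main__":
    xs = [1, 2, 3, 4, 5]
    ys = [2, 1, 4, 3, 5]
    point, sd_slope, reg_slope, r = sd_line_and_regression(xs, ys)
    print("point of averages:", point)
    print("SD line slope:", sd_slope)
    print("regression slope:", reg_slope)
    print("r:", r)
```

For this sample data the correlation is about 0.8, so the regression slope comes out at about 0.8 times the SD line slope, in agreement with the formula above.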

Typical distance of points to SD line

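The root mean square vertical distance of points from the SD line is [math]\displaystyle{ \sqrt{2(1 - |r|)} \times\sigma_y }[/math].[1] This quantity measures the spread of points around the SD line: it is 0 when [math]\displaystyle{ |r| = 1 }[/math], so that every point lies on the line, and grows to [math]\displaystyle{ \sqrt{2} \times\sigma_y }[/math] when [math]\displaystyle{ r = 0 }[/math].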
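The formula can be compared against a direct computation of the vertical distances. The sketch below assumes NumPy and population standard deviations (ddof=0), for which the two values agree up to floating-point rounding; with sample standard deviations (ddof=1) they would differ slightly. The function names rms_vertical_distance and rms_from_formula and the sample data are hypothetical.

```python
import numpy as np


def rms_vertical_distance(x, y):
    """RMS vertical distance of the points from the SD line, computed directly."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]

    # Assumption: take the positive slope when r is exactly zero.
    sign = -1.0 if r < 0 else 1.0
    slope = sign * y.std() / x.std()

    # Height of the SD line above each x value.
    line_y = y.mean() + slope * (x - x.mean())

    # Root mean square of the vertical deviations.
    return np.sqrt(np.mean((y - line_y) ** 2))


def rms_from_formula(x, y):
    """RMS vertical distance from the closed form sqrt(2 (1 - |r|)) * sigma_y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]
    return np.sqrt(2.0 * (1.0 - abs(r))) * y.std()


if __name__ == "__main__":
    xs = [1, 2, 3, 4, 5]
    ys = [2, 1, 4, 3, 5]
    print("direct: ", rms_vertical_distance(xs, ys))
    print("formula:", rms_from_formula(xs, ys))
```

Both functions compute the correlation independently so that each can be used on its own; the direct version makes no use of the closed form and so serves as an independent check on it.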