Rank statistic

A statistic (cf. Statistical estimator) constructed from a rank vector. If $ R = ( R _ {1} , \dots, R _ {n} ) $ is the rank vector constructed from a random observation vector $ X = ( X _ {1} , \dots, X _ {n} ) $, then any statistic $ T = T ( R) $ which is a function of $ R $ is called a rank statistic. A classical example of a rank statistic is the Kendall coefficient of rank correlation $ \tau $ between the vectors $ R $ and $ \ell = ( 1 , \dots, n ) $, defined by the formula

$$ \tau = \frac{1}{n ( n - 1 ) } \sum _ {i \neq j } \mathop{\rm sign} ( i - j ) \ \mathop{\rm sign} ( R _ {i} - R _ {j} ) . $$
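As an illustration, the rank vector and the Kendall coefficient can be computed directly from the definition. The following is a minimal sketch, assuming observations without ties; the function names `rank_vector` and `kendall_tau` and the sample data are hypothetical and chosen only for this example.

```python
from itertools import combinations


def rank_vector(x):
    """Return the ranks 1..n of the observations in x (assumes no ties)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    ranks = [0] * len(x)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def sign(v):
    """Sign of v as -1, 0 or 1."""
    return (v > 0) - (v < 0)


def kendall_tau(ranks):
    """Kendall coefficient between the ranks and the vector (1, ..., n)."""
    n = len(ranks)
    # Each unordered pair {i, j} occurs twice in the sum over i != j,
    # and both occurrences contribute the same product of signs.
    total = 2 * sum(
        sign(i - j) * sign(ranks[i] - ranks[j])
        for i, j in combinations(range(n), 2)
    )
    return total / (n * (n - 1))


x = [2.3, 0.7, 5.1, 3.8]   # illustrative observations
r = rank_vector(x)         # rank vector R
tau = kendall_tau(r)
```

For these four observations the rank vector is (2, 1, 4, 3); four of the six pairs are in the same order as in (1, ..., 4) and two are reversed, so the coefficient equals (4 - 2)/6 = 1/3.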

In the class of all rank statistics a special place is occupied by so-called linear rank statistics, defined as follows. Let $ A = \| a ( i , j ) \| $ be an arbitrary square matrix of order $ n $. Then the statistic

$$ T = \sum _ { i=1} ^ { n } a ( i , R _ {i} ) $$

is called a linear rank statistic. For example, the Spearman coefficient of rank correlation $ \rho $, defined by the formula

$$ \rho = \frac{12}{n ( n ^ {2} - 1 ) } \sum _ { i=1} ^ { n } \left ( i - \frac{n + 1}{2} \right ) \left ( R _ {i} - \frac{n + 1}{2} \right ) , $$

is a linear rank statistic.
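To make the matrix form concrete, the sketch below evaluates a linear rank statistic from an explicit matrix $ a ( i , j ) $ and recovers the Spearman coefficient from it. It assumes the usual normalization $ 12 / ( n ( n ^ {2} - 1 ) ) $ and centring at $ ( n + 1 ) / 2 $; the function names are hypothetical.

```python
def spearman_scores(n):
    """Matrix a(i, j) = 12 (i - c)(j - c) / (n (n^2 - 1)) with c = (n + 1) / 2.

    Rows and columns are stored 0-indexed for the 1-based indices i, j.
    """
    c = (n + 1) / 2
    scale = 12 / (n * (n * n - 1))
    return [
        [scale * (i - c) * (j - c) for j in range(1, n + 1)]
        for i in range(1, n + 1)
    ]


def linear_rank_statistic(a, ranks):
    """T = sum over i of a(i, R_i) for a rank vector with values 1..n."""
    return sum(a[i][r - 1] for i, r in enumerate(ranks))


def spearman_by_differences(ranks):
    """Equivalent form 1 - 6 sum d_i^2 / (n (n^2 - 1)), with d_i = i - R_i."""
    n = len(ranks)
    d2 = sum((i - r) ** 2 for i, r in enumerate(ranks, start=1))
    return 1 - 6 * d2 / (n * (n * n - 1))


ranks = [2, 1, 4, 3]  # illustrative rank vector
rho = linear_rank_statistic(spearman_scores(len(ranks)), ranks)
```

Any other choice of the matrix gives another linear rank statistic through the same `linear_rank_statistic` call; only the scores change.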

Linear rank statistics are, as a rule, computationally simple to construct, and their distributions are easy to find. For this reason the notion of the projection of a rank statistic into the family of linear rank statistics plays an important role in the theory of rank statistics. If $ T $ is a rank statistic constructed from a random vector $ X $ about whose distribution a hypothesis $ H _ {0} $ is made, then a linear rank statistic $ \widehat{T} = \widehat{T} ( R) $ for which $ {\mathsf E} \{ ( T - \widehat{T} ) ^ {2} \} $ is minimal when $ H _ {0} $ is true is called the projection of $ T $ into the family of linear rank statistics. As a rule, $ \widehat{T} $ approximates $ T $ well, and the difference $ T - \widehat{T} $ is negligibly small as $ n \rightarrow \infty $. If, under the hypothesis $ H _ {0} $, the components $ X _ {1} , \dots, X _ {n} $ of the random vector $ X $ are independent random variables, then the projection $ \widehat{T} $ of $ T $ can be determined by the formula

$$ \tag{*} \widehat{T} = \frac{n - 1}{n} \sum _ { i=1} ^ { n } \widehat{a} ( i , R _ {i} ) - ( n - 2 ) {\mathsf E} \{ T \} , $$

where $ \widehat{a} ( i , j ) = {\mathsf E} \{ T \mid R _ {i} = j \} $, $ 1 \leq i , j \leq n $ (see [1]).

There is an intrinsic connection between $ \tau $ and $ \rho $. It is shown in [1] that the projection $ \widehat \tau $ of the Kendall coefficient $ \tau $ into the family of linear rank statistics coincides, up to a multiplicative constant, with the Spearman coefficient $ \rho $; namely,

$$ \widehat \tau = \frac{2}{3} \left ( 1 + \frac{1}{n} \right ) \rho . $$
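This relation can be checked by brute force for small $ n $. The sketch below is only an illustration under stated assumptions: it takes the rank vector to be uniformly distributed over all $ n ! $ permutations under $ H _ {0} $ (as for independent, identically distributed continuous observations), computes $ \widehat{a} ( i , j ) = {\mathsf E} \{ T \mid R _ {i} = j \} $ by enumeration, and applies (*) with the coefficient $ ( n - 1 ) / n $ in front of the sum. Exact rational arithmetic is used so that the comparison is not affected by rounding; all function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def sign(v):
    return (v > 0) - (v < 0)


def kendall_tau(r):
    """Kendall coefficient of the rank vector r against (1, ..., n)."""
    n = len(r)
    s = sum(sign(i - j) * sign(r[i] - r[j])
            for i in range(n) for j in range(n) if i != j)
    return Fraction(s, n * (n - 1))


def spearman_rho(r):
    """Spearman coefficient with normalization 12 / (n (n^2 - 1))."""
    n = len(r)
    c = Fraction(n + 1, 2)
    s = sum((i + 1 - c) * (r[i] - c) for i in range(n))
    return Fraction(12, n * (n * n - 1)) * s


def projection(stat, n):
    """Return r -> T_hat(r) from formula (*), assuming uniform random ranks."""
    perms = list(permutations(range(1, n + 1)))
    values = {p: stat(p) for p in perms}
    mean = sum(values.values(), Fraction(0)) / len(perms)
    # a_hat[i][j - 1] = E{T | R_i = j}, an average over permutations with p[i] == j
    a_hat = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for j in range(1, n + 1):
            cond = [values[p] for p in perms if p[i] == j]
            a_hat[i][j - 1] = sum(cond, Fraction(0)) / len(cond)

    def t_hat(r):
        linear_part = sum(a_hat[i][r[i] - 1] for i in range(n))
        return Fraction(n - 1, n) * linear_part - (n - 2) * mean

    return t_hat


n = 5
tau_hat = projection(kendall_tau, n)
factor = Fraction(2, 3) * (1 + Fraction(1, n))
identity_holds = all(
    tau_hat(p) == factor * spearman_rho(p)
    for p in permutations(range(1, n + 1))
)
```

Enumeration costs on the order of $ n ! $ evaluations, so this check is practical only for small $ n $; it is meant as a sanity test of the formula, not as a way to compute projections in general.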

This equality implies that the correlation coefficient $ \mathop{\rm corr} ( \rho , \tau ) $ between $ \rho $ and $ \tau $ is equal to

$$ \mathop{\rm corr} ( \rho , \tau ) = \sqrt { \frac{ {\mathsf D} \widehat \tau }{ {\mathsf D} \tau } } = \frac{2 ( n + 1 ) }{\sqrt {2 n ( 2 n + 5 ) } } $$

which tends to 1 as $ n \rightarrow \infty $, so that these rank statistics are asymptotically equivalent for large $ n $ (cf. [2]).
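The closed form for the correlation can be compared with an exact computation for small $ n $. The following sketch assumes, as an illustration only, that the rank vector is uniformly distributed over all permutations; it computes the squared correlation of $ \rho $ and $ \tau $ exactly by enumeration and evaluates the closed form $ 2 ( n + 1 ) / \sqrt {2 n ( 2 n + 5 ) } $ for larger $ n $. Function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import sqrt


def sign(v):
    return (v > 0) - (v < 0)


def kendall_tau(r):
    n = len(r)
    s = sum(sign(i - j) * sign(r[i] - r[j])
            for i in range(n) for j in range(n) if i != j)
    return Fraction(s, n * (n - 1))


def spearman_rho(r):
    n = len(r)
    c = Fraction(n + 1, 2)
    s = sum((i + 1 - c) * (r[i] - c) for i in range(n))
    return Fraction(12, n * (n * n - 1)) * s


def exact_corr_squared(n):
    """Squared correlation of rho and tau over all n! equally likely rank vectors."""
    perms = list(permutations(range(1, n + 1)))
    m = len(perms)
    taus = [kendall_tau(p) for p in perms]
    rhos = [spearman_rho(p) for p in perms]
    mean_tau = sum(taus, Fraction(0)) / m
    mean_rho = sum(rhos, Fraction(0)) / m
    var_tau = sum((t - mean_tau) ** 2 for t in taus) / m
    var_rho = sum((r - mean_rho) ** 2 for r in rhos) / m
    cov = sum((t - mean_tau) * (r - mean_rho) for t, r in zip(taus, rhos)) / m
    return cov * cov / (var_tau * var_rho)


def closed_form_corr(n):
    """Closed form 2 (n + 1) / sqrt(2 n (2 n + 5))."""
    return 2 * (n + 1) / sqrt(2 * n * (2 * n + 5))


# Exact check for small n: the square of the closed form is 2 (n + 1)^2 / (n (2 n + 5)).
small_n_agree = all(
    exact_corr_squared(n) == Fraction(2 * (n + 1) ** 2, n * (2 * n + 5))
    for n in (3, 4, 5)
)

# The closed form approaches 1 as n grows.
trend = [closed_form_corr(n) for n in (5, 50, 500)]
```

For $ n = 5 $ the squared correlation is $ 24 / 25 $, so the two coefficients are already highly correlated for very small samples.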

References

[1] J. Hájek, Z. Šidák, "Theory of rank tests", Acad. Press (1967)
[2] M.G. Kendall, "Rank correlation methods", Griffin (1970)

From The Encyclopedia of Math: Rank statistic.
