Measure of presortedness

From HandWiki

In computer science, a measure of presortedness of a sequence represents how much work is required to sort the sequence. If the sequence is pre-sorted, sorting the sequence entirely require little work, hence it is expected to have a small measure of presortedness. In particular, the measure of a sorted sequence is 0.

Some sorting algorithms are more efficient on pre-sorted list, as they can use this pre-work into account to avoid duplicate work. The measure of presortedness allows to formalize the notion that an algorithm is optimal for a certain measure.

Definition

Let [math]\displaystyle{ \mathbb N^{*} }[/math] represents finite sequence of natural numbers. Let [math]\displaystyle{ m:\mathbb N^*\to\mathbb N }[/math] a function sending sequence of numbers to non-negative numbers. This function is said to be a measure of presortedness if it satisfies the following axioms:

  • if [math]\displaystyle{ X=\langle x_1,\dots,x_n\rangle }[/math] is in ascending order, then [math]\displaystyle{ m(X)=0 }[/math], that is, the measure of presortedness of sorted sequences is 0.
  • let [math]\displaystyle{ X=\langle x_1,\dots,x_n\rangle }[/math] and [math]\displaystyle{ Y=\langle y_1,\dots,y_n\rangle }[/math]. If [math]\displaystyle{ x_i\lt x_j }[/math] iff [math]\displaystyle{ y_i\lt y_j }[/math], then [math]\displaystyle{ m(X)=m(Y) }[/math]. That is, the measure do not depends on the actual numbers in the sequence, but only on their order compared to other elements of the sequences.
  • If [math]\displaystyle{ Y }[/math] is a subsequence of [math]\displaystyle{ X }[/math] then [math]\displaystyle{ m(Y)\le m(X) }[/math], that is, [math]\displaystyle{ m }[/math] is monotonic for the subsequence relation
  • let [math]\displaystyle{ X=\langle x_1,\dots,x_n\rangle }[/math] and [math]\displaystyle{ Y=\langle y_1,\dots,y_n\rangle }[/math]. If for each [math]\displaystyle{ i,j }[/math], [math]\displaystyle{ x_i\lt y_j }[/math] then [math]\displaystyle{ m(XY)\le m(X)+m(Y) }[/math], that is, the measure is subadditive
  • for each [math]\displaystyle{ x\in\mathbb N }[/math] and [math]\displaystyle{ X\in\mathbb N^{*} }[/math], [math]\displaystyle{ m(\langle x\rangle X)\le |X|+m(X) }[/math]. Intuitively, if you add an element [math]\displaystyle{ x }[/math], it can be in the wrong order compared to at most every other element of [math]\displaystyle{ X }[/math], and so the pre-sortedness can be increased by at most the length of [math]\displaystyle{ X }[/math].

Those axioms are similar to the axioms of measure theory. However, a measure of presortedness is not a measure as in measure theory, since it is not defined over a set but over sequence.

Optimal algorithm

It is well known that an optimal comparison sort requires to compare [math]\displaystyle{ O(n\log n) }[/math] pairs of elements in a sequence. That is, the decision tree complexity is [math]\displaystyle{ O(n\log n) }[/math]. However, when more informations is known about how much a sequence is presorted, more optimal bounds can be devised. This notion is now formalized.

Given a measure of pre-sortedness [math]\displaystyle{ m }[/math], a sequence [math]\displaystyle{ X=\langle x_1,\dots,x_n\rangle }[/math], [math]\displaystyle{ z\ge m(X) }[/math], let [math]\displaystyle{ \mathtt{below}(z,X, m) }[/math] be the set of permutations of [math]\displaystyle{ X }[/math] whose measure is at most [math]\displaystyle{ Z }[/math]. Then let [math]\displaystyle{ \mathtt{below}(X, m)=\mathtt{below}(m(X), X, m) }[/math] the set of permutations whose measure is at most the measure of [math]\displaystyle{ X }[/math].

A sorting algorithm [math]\displaystyle{ S }[/math] which takes as input [math]\displaystyle{ X }[/math] and [math]\displaystyle{ z }[/math] is said to be weakly [math]\displaystyle{ m }[/math]-optimal if its time complexity is [math]\displaystyle{ O(\max(n, \log|\mathtt{below}(z, X, m)|)) }[/math]. That is, the number of operations is logarithmic in the number of possible permutations whose measure is at most [math]\displaystyle{ z }[/math]. This is optimal in the sens that simply selecting an element in a set of size [math]\displaystyle{ N }[/math] requires [math]\displaystyle{ \log(N) }[/math] bits of information.

Similarly, a sorting algorithm which takes as input [math]\displaystyle{ X }[/math] is [math]\displaystyle{ m }[/math]-optimal if its time complexit is [math]\displaystyle{ O(\max(n, \log|\mathtt{below}(z, X, m)|)) }[/math]. When [math]\displaystyle{ m(X) }[/math] is computable in linear time, a weakly [math]\displaystyle{ m }[/math]-optimal algorithm is automatically [math]\displaystyle{ m }[/math]-optimal. However, if [math]\displaystyle{ m }[/math] is more complex, it is possible that computing [math]\displaystyle{ m(X) }[/math] is actully more costly than sorting the sequence in the first place, and thus only an upper bound [math]\displaystyle{ z }[/math] of [math]\displaystyle{ m(X) }[/math] can be used.

Similarly, a decision tree is weakly [math]\displaystyle{ m }[/math]-optimal if its complexity is [math]\displaystyle{ O(\max(n, \log|\mathtt{below}(z, X, m)|)) }[/math] and it is [math]\displaystyle{ m }[/math]-optimal if its complexity [math]\displaystyle{ O(\max(n, \log|\mathtt{below}(X, m)|)) }[/math].

Existence of decision tree

For each measure [math]\displaystyle{ m }[/math], there exists a weakly [math]\displaystyle{ m }[/math]-optimal decision tree.

If the decision tree complexity of [math]\displaystyle{ m }[/math] is [math]\displaystyle{ O(\max(|X|, \log|\mathtt{below}(X, m)|)) }[/math], there exists an [math]\displaystyle{ m }[/math]-optimal decision tree. This decision tree's root compute [math]\displaystyle{ m(X) }[/math]. Once [math]\displaystyle{ m(X) }[/math] is computed in a node, the weakly [math]\displaystyle{ m }[/math]-optimal decision tree for [math]\displaystyle{ m(X) }[/math] is appended at this node.

The existence of the tree does not implie that there exists (weakly) [math]\displaystyle{ m }[/math]-optimal sorting algorithm, as computing the decision tree may be costly.

Examples

Here is a list of measures of presortedness [math]\displaystyle{ m }[/math] and related (weakly) [math]\displaystyle{ m }[/math]-optimal algorithms. Let [math]\displaystyle{ X=\langle x_1,\dots,x_n\rangle }[/math] be fixed for the remaining of the section.

Let us introduce a sorting algorithm that will be optimal for multiple measure. Let's call local insertion sort the algorithm that consists in iterating on the sequence and adding each element in a finger tree.

The zero measure

The constant function 0 is the most trivial measure of presortedness. Any sorting function running in [math]\displaystyle{ O(n\log n) }[/math]-time, such as heapsort and mergesort is [math]\displaystyle{ 0 }[/math]-optimal.

The indicator function

The indicator function [math]\displaystyle{ m_{01} }[/math] of sorted sequence is a measure of sortedness. This function sends sorted sequences to 0 and all other sequences to 1. Any algorithm that check whether a list is sorted, and if it is not sorted apply a [math]\displaystyle{ 0 }[/math]-optimal sorting algorithm is [math]\displaystyle{ m_{01} }[/math]-optimal.

Number of runs

The number of runs, [math]\displaystyle{ \mathtt{Runs}(X) }[/math], is the number of increasing sequences in [math]\displaystyle{ X }[/math] minus one. It is equivalently defined as the number of [math]\displaystyle{ i }[/math] such that [math]\displaystyle{ x_{i+1}\lt x_i }[/math]. is a measure of presortedness.

Natural merge sort, which consists in using existing runs and merging them, is a [math]\displaystyle{ \mathtt{Runs} }[/math]-optimal algorithm. The local insertion sort is also Runs-optimal.

Number of inversions

The number of inversion in [math]\displaystyle{ X }[/math], [math]\displaystyle{ \mathtt{Inv}(X) }[/math], is the number of pairs [math]\displaystyle{ i\lt j }[/math] such that [math]\displaystyle{ x_i\gt x_j }[/math]. It is a measure of presortedness and the local insertion sort is Inv-optimal.

Number of deletions

The mininimum number [math]\displaystyle{ \mathtt{Rem}(X) }[/math] of elements that need to be removed from [math]\displaystyle{ X }[/math] to obtain a sorted sequences. It is a measure of presortedness and the local insertion sort is Rem-optimal.

Radius

The radius [math]\displaystyle{ \mathtt{Rad}(X) }[/math] of [math]\displaystyle{ X }[/math][1] is a measure of presortdness.

The following algorithm is Rad-optimal. It consists in iteratively halving the radius until it reaches 0. Halving the radius can be done by three merges of sub-sequences of length [math]\displaystyle{ \mathtt{Rad}(X) }[/math].

Minimum of two measures

Given two measures of presortdness [math]\displaystyle{ m_1 }[/math] and [math]\displaystyle{ m_2 }[/math], the measure [math]\displaystyle{ X\mapsto\min(m_1(X), m_2(X)) }[/math] is also a measure of presortdness. Given two [math]\displaystyle{ m_i }[/math]-optimal sorting algorithms [math]\displaystyle{ S_i }[/math], the algorithm consisting in executing [math]\displaystyle{ S_1 }[/math] and [math]\displaystyle{ S_2 }[/math] in parallel and returning the first output is a [math]\displaystyle{ \min(m_1,m_2) }[/math]-optimal algorithm.

References

  1. Estivill-Castro, Vladimir; Wood, Derick (October 1989). "A new measure of presortdness". Information andComputation 83 (1): 111–119. doi:10.1016/0890-5401(89)90050-3. 
  • Mannila, H (1985). "Measures of Presortedness and Optimal Sorting Algorithms". IEEE Trans. Comput (C-34): 318–325. 

Template:Data structures and algorithms