Tournament sort

From HandWiki
Tournament sort
ClassSorting algorithm
Data structureArray
Worst-case performanceO(n log n)
Average performanceO(n log n)

Tournament sort is a sorting algorithm. It improves upon the naive selection sort by using a priority queue to find the next element in the sort. In the naive selection sort, it takes O(n) operations to select the next element of n elements; in a tournament sort, it takes O(log n) operations (after building the initial tournament in O(n)). Tournament sort is a variation of heapsort.

Common application

Tournament replacement selection sorts are used to gather the initial runs for external sorting algorithms. Conceptually, an external file is read and its elements are pushed into the priority queue until the queue is full. Then the minimum element is pulled from the queue and written as part of the first run. The next input element is read and pushed into the queue, and the min is selected again and added to the run. There's a small trick that if the new element being pushed into the queue is less than the last element added to the run, then the element's sort value is increased so it will be part of the next run. On average, a run will be 100% longer than the capacity of the priority queue.[1]

Tournament sorts may also be used in N-way merges.

Etymology

The name comes from its similarity to a single-elimination tournament where there are many players (or teams) that play in two-sided matches. Each match compares the players, and the winning player is promoted to play a match at the next level up. The hierarchy continues until the final match determines the ultimate winner. The tournament determines the best player, but the player who was beaten in the final match may not be the second best – he may be inferior to other players the winner bested.

sorting scheme

Sort numbers 72356014 (count = 8)

7_             __             __
  \2_            \__            \__             7_          7_          __
2_/  \         __/  \         __/  \              \           \           \
3_    2-       __    2-       __    2-       __    3-          5-          5-       7-
  \3_/  \        \__/  \        \__/  \        \3_/  \      5_/  \      __/  \        \
5_/      \     __/      \     __/      \     __/      \           \           \        \
          \_0            \_1            \_2            \_3         \_4         \_5      \_67
6_        /              /              /              /           /           /        /
  \0_    /        6_    /        6_    /        __    /     __    /           /        /
0_/  \  /           \  /           \  /           \  /        \  /           /        /
1_    0-       __    1-             4-             4-          4-          6-       6-
  \1_/           \1_/            4_/            __/         __/
4_/            __/

Compares: 7 + 2 + 2 + 2 + 2 + 1 + 1 = 17

On

Onminmax(n log n) = (ln(n) / ln(2) - 1) * n + 1
On = (n - 1) + (n / 2) * (n / 4) + (n / (2 * 2) ) * (n / ( 4 * 2 )) + (n / (2 * 2 * 2) ) * (n / ( 4 * 2 * 2 )) + ...
O8 = (8 - 1) + (8 / 2) * (8 / 4) + (8 / (2 * 2) ) * (8 / ( 4 * 2 ))
O8 = 7 + 4 * 2 + 2 * 1 = 7 + 8 + 2 = 17

Implementation

The following is an implementation of tournament sort in Haskell, based on Scheme code by Stepanov and Kershenbaum.[2]

import Data.Tree

-- | Adapted from `TOURNAMENT-SORT!` in the Stepanov and Kershenbaum report.
tournamentSort :: Ord t
       => [t]  -- ^ Input: an unsorted list
       -> [t]  -- ^ Result: sorted version of the input
tournamentSort alist
        = go (pure<$>alist) -- first, wrap each element as a single-tree forest
 where go [] = []
       go trees = (rootLabel winner) : (go (subForest winner))
        where winner = playTournament trees

-- | Adapted from `TOURNAMENT!` in the Stepanov and Kershenbaum report
playTournament :: Ord t
         => Forest t -- ^ Input forest
         -> Tree t   -- ^ The last promoted tree in the input
playTournament [tree] = tree
playTournament trees = playTournament (playRound trees [])

-- | Adapted from `TOURNAMENT-ROUND!` in the Stepanov and Kershenbaum report
playRound :: Ord t
       => Forest t -- ^ A forest of trees that have not yet competed in round
       -> Forest t -- ^ A forest of trees that have won in round
       -> Forest t -- ^ Output: a forest containing promoted versions
                   --   of the trees that won their games
playRound [] done = done
playRound [tree] done = tree:done
playRound (tree0:tree1:trees) done = playRound trees (winner:done)
 where winner = playGame tree0 tree1

-- | Adapted from `TOURNAMENT-PLAY!` in the Stepanov and Kershenbaum report
playGame :: Ord t
         => Tree t  -- ^ Input: ...
         -> Tree t  -- ^ ... two trees
         -> Tree t  -- ^ Result: `promote winner loser`, where `winner` is
                    --   the tree with the *lesser* root of the two inputs
playGame tree1 tree2
    | rootLabel tree1 <= rootLabel tree2  = promote tree1 tree2
    | otherwise                           = promote tree2 tree1

-- | Adapted from `GRAB!` in the Stepanov and Kershenbaum report
promote :: Tree t  -- ^ The `winner`
        -> Tree t  -- ^ The `loser`
        -> Tree t  -- ^ Result: a tree whose root is the root of `winner`
                   --   and whose children are:
                   --   * `loser`,
                   --   * all the children of `winner`
promote winner loser = Node {
    rootLabel = rootLabel winner,
    subForest = loser : subForest winner}

main :: IO ()
main = print $ tournamentSort testList
 where testList = [0, 1, 2, 3, 4, 5]

Javascript code


// build first pyramid of minimal values
function pyramid_part1_buildPyramid(list, i_start, i_end, size, cmp, swap)
	{
	var i,j,k, k_end, lvl, lvlp1;
	var pyramid = [];
	i   = i_start;
	j   = i_start+1;
	k   = 0;
	lvl = 0;
	pyramid[lvl] = [];

	while (j<i_end)
		{
		if (cmp(list[i], list[j]))
			{swap(list, i, j);}
		pyramid[lvl][k] = i; i+=2; j+=2; k++;
		}
	if (i<i_end)	// pokud je size liche cislo, pak pridej posledni prvek a preswapuj to
			// (toho vyuziji pozdeji v part2)
		{
		if (cmp(list[i-2], list[i]))
			{
			tmp = list[i];
			list[i  ] = list[i-1];
			list[i-1] = list[i-2];
			list[i-2] = tmp; movesAdd(4);
			pyramid[lvl][k] = i;
			}
		else {if (cmp(list[i-1], list[i]))
			{
			tmp = list[i];
			list[i  ] = list[i-1];
			list[i-1] = tmp; movesAdd(3);
			}}
		}

	i_end = k;
	lvlp1 = lvl + 1;
	while (i_end>1)
		{
		pyramid[lvlp1] = [];
		k = 0;
		i = 0;
		j = 1;	// =i+1
		while (j<i_end)
			{
			if (cmp(list[ pyramid[lvl][i] ], list[ pyramid[lvl][j] ]))
				{pyramid[lvlp1][k] = pyramid[lvl][j]; i+=2; j+=2; k++; continue;}
			else	{pyramid[lvlp1][k] = pyramid[lvl][i]; i+=2; j+=2; k++; continue;}
			}
		if (i<i_end) {pyramid[lvlp1][k] = pyramid[lvl][i]; k++;}
		lvl++;
		lvlp1++;
		i_end = k;
		}
	return [pyramid, lvl, pyramid[lvl][0], (size>>1)<<1 != size];
	// return pyramid, last lvl, last index, boolean for odd-size)
	}

function pyramid_part3_rebuildPyramidEven(pyramid, lvl_end, bool, list, cmp, i_end, pos)
	{
	var lvl, val2, empty = -1, a, b;
	val2 = pyramid[0][pos];
	for (lvl=0; lvl<lvl_end; lvl++)
		{
		if ((pos & 0x01) == 0)
			{
			if (pos==pyramid[lvl].length-1)
				{
				pos = pos>>1;
				pyramid[lvl+1][pos] = val2; //val2 = val2;
				continue;
				}
			b = pyramid[lvl][pos+1];
			a = pyramid[lvl][pos];
			pos = pos>>1;
			if (b==empty)
				{pyramid[lvl+1][pos] = a; val2 = a; continue;}
			if (cmp(list[a], list[b]))
				{pyramid[lvl+1][pos] = b; val2 = b; continue;}
			pyramid[lvl+1][pos] = a; val2 = a;
			}
		else	{
			a = pyramid[lvl][pos-1];
			b = pyramid[lvl][pos];
			pos = pos>>1;
			if (a==empty)
				{pyramid[lvl+1][pos] = b; val2 = b; continue;}
			if (cmp(list[a], list[b]))
				{pyramid[lvl+1][pos] = b; val2 = b; continue;}
			pyramid[lvl+1][pos] = a; val2 = a;
			}
		}
	return [pyramid, lvl_end, pyramid[lvl_end][0], bool];
	}

// rebuild pyramid, rewrite branch by new value
function pyramid_part2_rebuildPyramid(pyramid, lvl_end, bool, list, cmp, i_end, i_endm3)
	{
	var cycles = 0;
	var lvl, pos, val, val2, a, b, empty=-1;
	val = pyramid[lvl_end][0];
	pos = val>>1;			// pozice zleva
	if (bool==true && ((pos<<1)==i_endm3) && ((val & 0x01) == 0) )
	  // kdyz je size liche cislo a dojde k eliminaci n-2, tak posun posledni 2 cisla
		{
		bool = false;
		list[val]   = list[val+1];
		list[val+1] = list[val+2]; movesAdd(2);
					// je sude, pak vymen za liche a prepocitej porovnani
					// (prepocitej vsechna nutna porovnani)
		pyramid[0][pos] = val;
		// pozn.: tento kod je prepsany na funkci, protoze by byl duplicitne
		return
		  pyramid_part3_rebuildPyramidEven(pyramid, lvl_end, bool, list, cmp, i_end, pos);
		}
	else {if ((val & 0x01) == 0)	// je sude, pak vymen za liche a prepocitej porovnani
		{
		pyramid[0][pos] = val + 1;
		return pyramid_part3_rebuildPyramidEven(pyramid, lvl_end, bool, list, cmp, i_end, pos);
		}
	else	{			// je liche, pak odstran a prepocitej porovnani
		val2 = empty;
		pyramid[0][pos] = val2;
		for (lvl=0; lvl<lvl_end; lvl++)
			{
			if ((pos & 0x01) == 0)
				{
				if (pos==pyramid[lvl].length-1)
					{
					pos = pos>>1;
					pyramid[lvl+1][pos] = val2; //val2 = val2
					continue;
					}
				a = pyramid[lvl][pos];
				b = pyramid[lvl][pos+1];
				pos = pos>>1;
				if (a!==empty && b!==empty)
					{
					if (cmp(list[a], list[b]))
						{pyramid[lvl+1][pos] = b; val2 = b; continue;}
					else	{pyramid[lvl+1][pos] = a; val2 = a; continue;}
					}
				if (b!==empty)
					{pyramid[lvl+1][pos] = b; val2 = b; continue;}
				pyramid[lvl+1][pos] = a;
				val2 = a;
				}
			else	{
				a = pyramid[lvl][pos-1];
				b = pyramid[lvl][pos];
				pos = pos>>1;
				if (a!==empty && b!==empty)
					{
					if (cmp(list[a], list[b]))
						{pyramid[lvl+1][pos] = b; val2 = b; continue;}
					else	{pyramid[lvl+1][pos] = a; val2 = a; continue;}
					}
				if (a!==empty)
					{pyramid[lvl+1][pos] = a; val2 = a; continue;}
				pyramid[lvl+1][pos] = b;
				val2 = b;
				}
			}
		}}
	return [pyramid, lvl_end, pyramid[lvl_end][0], bool];
	}

// princip: vyber minimum z kazdeho paru, pak porovnej minima, minima minim ... az ziskas nejmensi cislo
// pak vyrad nejmensi cislo z pyramidy a propocitej celou vetev, opet ziskej minimum
function PyramidSelectSort(list, start, end, cmp)
	{
	var pyramid_data, i, x, y, endm3 = end-3, size = end - start;
	x = list;
	y = [];
	pyramid_data = pyramid_part1_buildPyramid(x, start, end, size, cmp, swap);
	  // create pyramid of index from minimal values of pair
	i = start;
	y[i] = x[pyramid_data[2]]; movesAdd(1);
	i++;
	while (i<end)
		{
		pyramid_data = pyramid_part2_rebuildPyramid(
			pyramid_data[0], pyramid_data[1], pyramid_data[3], x, cmp, end, endm3);
		y[i] = x[pyramid_data[2]]; movesAdd(1);
		i++;
		}
	return y;
	}

// list = PyramidSelectSort(list, start, end, cmp);

// ---

// only for statistic info about moves, cycles, compares
function movesAdd(a) {glob.moves += a;};
function cyclesAdd() {glob.cycles++;};
function comparesAdd() {glob.cmps++;};

function cmp(a, b) {comparesAdd(); return a>b;}

// ---

list = [7,7,4,3,4,7,6,7,0,1,0,6,7,2,2,4];
	var glob = {time: 0, cmps: 0, cycles: 0, moves: 0, swaps: 0, start: 0, end: list.length, size: 0};
	glob.size = glob.end - glob.start;

list = PyramidSelectSort(list, glob.start, glob.end, cmp);

/*
stable
fast
need list.size memory for state (true/false)
need list.size memory for new array
*/

References

  1. Donald Knuth, The Art of Computer Programming, Sorting and Searching, Volume 3, 1973. The "snowplow" argument. p. 254
  2. Template:Cite tech report