Patience sorting

Class: Sorting algorithm
Data structure: Array
Worst-case performance: O(n log n)
Best-case performance: O(n); occurs when the input is pre-sorted[1]

In computer science, patience sorting is a sorting algorithm inspired by, and named after, the card game patience. A variant of the algorithm efficiently computes the length of a longest increasing subsequence in a given array.

Overview

The algorithm's name derives from a simplified variant of the patience card game. The game begins with a shuffled deck of cards. The cards are dealt one by one into a sequence of piles on the table, according to the following rules.[2]

  1. Initially, there are no piles. The first card dealt forms a new pile consisting of the single card.
  2. Each subsequent card is placed on the leftmost existing pile whose top card has a value greater than or equal to the new card's value, or to the right of all of the existing piles, thus forming a new pile.
  3. When there are no more cards remaining to deal, the game ends.

This card game is turned into a two-phase sorting algorithm, as follows. Given an array of n elements from some totally ordered domain, consider this array as a collection of cards and simulate the patience sorting game. When the game is over, recover the sorted sequence by repeatedly picking off the minimum visible card; in other words, perform a p-way merge of the p piles, each of which is internally sorted.
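The two phases can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not an implementation from the cited sources: the function name `patience_sort` is hypothetical, each pile is a list whose last element is the top card, and the standard-library `heapq.merge` stands in for the priority-queue merge.

```python
from bisect import bisect_left
from heapq import merge


def patience_sort(values):
    """Sort `values` by dealing them into piles, then merging the piles."""
    piles = []  # piles[j] is a list; its last element is the top card
    tops = []   # tops[j] == piles[j][-1]; kept increasing from left to right

    # Phase 1: deal each card onto the leftmost pile whose top is >= the card.
    for x in values:
        j = bisect_left(tops, x)  # index of the leftmost top that is >= x
        if j == len(piles):
            piles.append([x])     # no such pile: start a new one on the right
            tops.append(x)
        else:
            piles[j].append(x)
            tops[j] = x

    # Phase 2: each pile is non-increasing from bottom to top, so reading it
    # from the top down yields an ascending run; merge those runs.
    return list(merge(*(reversed(pile) for pile in piles)))


print(patience_sort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```

Keeping the top cards in a separate list lets the binary search run on plain values without touching the piles themselves.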

Analysis

The first phase of patience sort, the card game simulation, can be implemented to take O(n log n) comparisons in the worst case for an n-element input array: there will be at most n piles, and by construction, the top cards of the piles form an increasing sequence from left to right, so the desired pile can be found by binary search.[1] The second phase, the merging of piles, can be done in O(n log n) time as well using a priority queue.[1]

When the input data contain natural "runs", i.e., non-decreasing subarrays, performance can be strictly better. In fact, when the input array is already sorted, all values form a single pile and both phases run in O(n) time. (Under the dealing rules given above, the single pile arises for input in non-increasing order; a formulation that instead appends each element to the first pile whose last element does not exceed it behaves the same way on non-decreasing input.) The average-case complexity is still O(n log n): a uniformly random sequence of values produces an expected O(√n) piles,[3] which take O(n log √n) = O(n log n) time to produce and merge.[1]
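The effect of input order on the number of piles can be checked directly. The sketch below is a minimal illustration, not taken from the cited sources; `count_piles` is a hypothetical helper that follows the dealing rules from the overview and tracks only the top card of each pile.

```python
from bisect import bisect_left
import random


def count_piles(values):
    """Return the number of piles produced by dealing `values` in order."""
    tops = []  # top card of each pile, increasing from left to right
    for x in values:
        j = bisect_left(tops, x)  # leftmost pile whose top is >= x
        if j == len(tops):
            tops.append(x)        # start a new pile on the right
        else:
            tops[j] = x           # x becomes the new top of pile j
    return len(tops)


n = 10_000
print(count_piles(range(n, 0, -1)))  # 1: every card fits on the first pile
print(count_piles(range(n)))         # 10000: every card starts a new pile

shuffled = list(range(n))
random.Random(0).shuffle(shuffled)   # fixed seed, chosen arbitrarily
print(count_piles(shuffled))         # on the order of sqrt(n) piles
```

With these rules, non-increasing input collapses onto one pile and strictly increasing input opens a new pile for every card, while shuffled input stays far below n piles.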

An evaluation of the practical performance of patience sort is given by Chandramouli and Goldstein, who show that a naive version is about ten to twenty times slower than a state-of-the-art quicksort on their benchmark problem. They attribute this to the relatively small amount of research put into patience sort, and develop several optimizations that bring its performance to within a factor of two of that of quicksort.[1]

If the values of the cards are in the range 1, ..., n, there is an efficient implementation with O(n log log n) worst-case running time for putting the cards into piles, relying on a van Emde Boas tree.[3]

Relations to other problems

Patience sorting is closely related to a card game called Floyd's game. This game is very similar to the game sketched earlier:[2]

  1. The first card dealt forms a new pile consisting of the single card.
  2. Each subsequent card is placed on some existing pile whose top card has a value no less than the new card's value, or to the right of all of the existing piles, thus forming a new pile.
  3. When there are no more cards remaining to deal, the game ends.

The object of the game is to finish with as few piles as possible. The difference from the patience sorting algorithm is that there is no requirement to place a new card on the leftmost pile where it is allowed. Patience sorting constitutes a greedy strategy for playing this game.

Aldous and Diaconis suggest defining 9 or fewer piles as a winning outcome for n = 52, which happens with approximately 5% probability.[4]

Algorithm for finding a longest increasing subsequence

First, execute the first phase of the sorting algorithm as described above. The number of piles is the length of a longest increasing subsequence. Whenever a card is placed on top of a pile, store a back-pointer from it to the current top card of the previous pile (which, by construction, has a lower value than the new card). At the end, follow the back-pointers from the top card of the last pile to recover a decreasing subsequence of the longest length; its reverse is a solution to the longest increasing subsequence problem.
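The back-pointer construction can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the implementation described in the cited sources: the function name and the arrays `top_values`, `top_index` and `back` are hypothetical, and only the top card of each pile is stored, since the lower cards are never needed.

```python
from bisect import bisect_left


def longest_increasing_subsequence(seq):
    """Return one longest strictly increasing subsequence of `seq`."""
    seq = list(seq)
    top_values = []           # top_values[j]: value on top of pile j
    top_index = []            # top_index[j]: position in seq of that card
    back = [None] * len(seq)  # back[i]: position of the card i points to

    for i, x in enumerate(seq):
        j = bisect_left(top_values, x)  # leftmost pile whose top is >= x
        # Point at the current top of the previous pile, if there is one.
        back[i] = top_index[j - 1] if j > 0 else None
        if j == len(top_values):
            top_values.append(x)
            top_index.append(i)
        else:
            top_values[j] = x
            top_index[j] = i

    # Follow back-pointers from the top of the last pile, then reverse.
    result = []
    i = top_index[-1] if top_index else None
    while i is not None:
        result.append(seq[i])
        i = back[i]
    result.reverse()
    return result


print(longest_increasing_subsequence([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 4, 5, 6]
```

Because equal cards go onto the same pile, the recovered subsequence is strictly increasing; other subsequences of the same length may exist.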

S. Bespamyatnikh and M. Segal[3] describe an efficient implementation of the algorithm that incurs no additional asymptotic cost over sorting, since storing, creating and traversing the back-pointers requires only linear time and space. They further show how to report all the longest increasing subsequences from the same resulting data structures.

History

Patience sorting was named by C. L. Mallows, who attributed its invention to A.S.C. Ross in the early 1960s.[1] According to Aldous and Diaconis,[4] patience sorting was first recognized as an algorithm to compute the longest increasing subsequence length by Hammersley.[5] A.S.C. Ross and independently Robert W. Floyd recognized it as a sorting algorithm. Initial analysis was done by Mallows.[6] Floyd's game was developed by Floyd in correspondence with Donald Knuth.[2]

Use

The patience sorting algorithm can be applied to process control. Within a series of measurements, the existence of a long increasing subsequence can be used as a trend marker. A 2002 article in SQL Server magazine includes a SQL implementation, in this context, of the patience sorting algorithm for the length of the longest increasing subsequence.[7]

References

  1. Chandramouli, Badrish; Goldstein, Jonathan (2014). "Patience is a Virtue: Revisiting Merge and Sort on Modern Processors". SIGMOD/PODS. https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/patsort-sigmod14.pdf.
  2. Burstein, Alexander; Lankham, Isaiah (2006). "Combinatorics of patience sorting piles". Séminaire Lotharingien de Combinatoire 54A. Bibcode: 2005math......6358B. http://www.emis.de/journals/SLC/wpapers/s54Aburlank.pdf.
  3. Bespamyatnikh, Sergei; Segal, Michael (2000). "Enumerating Longest Increasing Subsequences and Patience Sorting". Information Processing Letters 76 (1–2): 7–11. doi:10.1016/s0020-0190(00)00124-1.
  4. Aldous, David; Diaconis, Persi (1999). "Longest increasing subsequences: from patience sorting to the Baik-Deift-Johansson theorem". Bulletin of the American Mathematical Society. New Series 36 (4): 413–432. doi:10.1090/s0273-0979-99-00796-x. http://www-stat.stanford.edu/~cgates/PERSI/year.html#99.
  5. Hammersley, John (1972). "A few seedlings of research". Proc. Sixth Berkeley Symp. Math. Statist. and Probability. 1. University of California Press. pp. 345–394.
  6. Mallows, C. L. (1973). "Patience sorting". Bull. Inst. Math. Appl. 9: 216–224.
  7. Kass, Steve (April 30, 2002). "Statistical Process Control". SQL Server Pro. http://sqlmag.com/t-sql/statistical-process-control. Retrieved 23 April 2014.