Counting sort

From HandWiki
Revision as of 14:37, 6 February 2024 by BotanyGa (talk | contribs) (update)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Short description: Sorting algorithm
Counting sort
ClassSorting Algorithm
Data structureArray
Worst-case performance[math]\displaystyle{ O(n+k) }[/math], where k is the range of the non-negative key values.
Worst-case space complexity[math]\displaystyle{ O(n+k) }[/math]

In computer science, counting sort is an algorithm for sorting a collection of objects according to keys that are small positive integers; that is, it is an integer sorting algorithm. It operates by counting the number of objects that possess distinct key values, and applying prefix sum on those counts to determine the positions of each key value in the output sequence. Its running time is linear in the number of items and the difference between the maximum key value and the minimum key value, so it is only suitable for direct use in situations where the variation in keys is not significantly greater than the number of items. It is often used as a subroutine in radix sort, another sorting algorithm, which can handle larger keys more efficiently.Cite error: Closing </ref> missing for <ref> tag

Variant algorithms

If each item to be sorted is itself an integer, and used as key as well, then the second and third loops of counting sort can be combined; in the second loop, instead of computing the position where items with key i should be placed in the output, simply append Count[i] copies of the number i to the output.

This algorithm may also be used to eliminate duplicate keys, by replacing the Count array with a bit vector that stores a one for a key that is present in the input and a zero for a key that is not present. If additionally the items are the integer keys themselves, both second and third loops can be omitted entirely and the bit vector will itself serve as output, representing the values as offsets of the non-zero entries, added to the range's lowest value. Thus the keys are sorted and the duplicates are eliminated in this variant just by being placed into the bit array.

For data in which the maximum key size is significantly smaller than the number of data items, counting sort may be parallelized by splitting the input into subarrays of approximately equal size, processing each subarray in parallel to generate a separate count array for each subarray, and then merging the count arrays. When used as part of a parallel radix sort algorithm, the key size (base of the radix representation) should be chosen to match the size of the split subarrays.[1] The simplicity of the counting sort algorithm and its use of the easily parallelizable prefix sum primitive also make it usable in more fine-grained parallel algorithms.[2]

As described, counting sort is not an in-place algorithm; even disregarding the count array, it needs separate input and output arrays. It is possible to modify the algorithm so that it places the items into sorted order within the same array that was given to it as the input, using only the count array as auxiliary storage; however, the modified in-place version of counting sort is not stable.[3]

History

Although radix sorting itself dates back far longer, counting sort, and its application to radix sorting, were both invented by Harold H. Seward in 1954.[4][5][6]

References

  1. Zagha, Marco (1991), "Radix sort for vector multiprocessors", Proceedings of Supercomputing '91, November 18-22, 1991, Albuquerque, NM, USA, IEEE Computer Society / ACM, pp. 712–721, doi:10.1145/125826.126164, ISBN 0897914597, https://www.cs.cmu.edu/~scandal/papers/cray-sort-supercomputing91.ps.gz .
  2. "An optimal parallel algorithm for integer sorting", Proc. 26th Annual Symposium on Foundations of Computer Science (FOCS 1985), 1985, pp. 496–504, doi:10.1109/SFCS.1985.9, ISBN 0-8186-0644-4 .
  3. Cite error: Invalid <ref> tag; no text was provided for refs named sedgewick
  4. Cite error: Invalid <ref> tag; no text was provided for refs named clrs
  5. Knuth, D. E. (1998), The Art of Computer Programming, Volume 3: Sorting and Searching (2nd ed.), Addison-Wesley, ISBN 0-201-89685-0 . Section 5.2, Sorting by counting, pp. 75–80, and historical notes, p. 170.
  6. Seward, H. H. (1954), "2.4.6 Internal Sorting by Floating Digital Sort", Information sorting in the application of electronic digital computers to business operations, Master's thesis, Report R-232, Massachusetts Institute of Technology, Digital Computer Laboratory, pp. 25–28, http://bitsavers.org/pdf/mit/whirlwind/R-series/R-232_Information_Sorting_in_the_Application_of_Electronic_Digital_Computers_to_Business_Operations_May54.pdf .

External links