qsort

From HandWiki

qsort is a C standard library function that implements a sorting algorithm for arrays of arbitrary objects according to a user-provided comparison function. It is named after the "quicker sort" algorithm[1] (a quicksort variant due to R. S. Scowen), which was originally used to implement it in the Unix C library, although the C standard does not require it to implement quicksort.[2]

The ability to operate on different kinds of data (polymorphism) is achieved by taking a function pointer to a three-way comparison function, as well as a parameter that specifies the size of its individual input objects. The C standard requires the comparison function to implement a total order on the items in the input array.[3]

History

A qsort function appears in Version 2 Unix in 1972 as a library assembly language subroutine. Its interface is unlike the modern version, in that it can be pseudo-prototyped as qsort(void * start, void * end, unsigned length) – sorting contiguously-stored length-long byte strings from the range [start, end).[1] This, and the lack of a replaceable comparison function, makes it unsuitable to properly sort the system's little-endian integers, or any other data structures.

In Version 3 Unix, the interface is extended by calling compar(III), with an interface identical to modern-day memcmp. This function may be overriden by the user's program to implement any kind of ordering, in an equivalent fashion to the compar argument to standard qsort (though program-global, of course).[4]

Version 4 Unix adds a C implementation, with an interface equivalent to the standard.[5] It was rewritten in 1983 for the Berkeley Software Distribution.[2] The function was standardized in ANSI C (1989). The assembly implementation is removed in Version 6 Unix.[6]

In 1991, Bell Labs employees observed that AT&T and BSD versions of qsort would consume quadratic time for some simple inputs. Thus Jon Bentley and Douglas McIlroy engineered a new faster and more robust implementation.[2] McIlroy would later produce a more complex quadratic-time input, termed AntiQuicksort, in 1998. This function constructs adversary data on-the-fly.[7]

Example

The following piece of C code shows how to sort a list of integers using qsort.

#include <stdlib.h>

/* Comparison function. Receives two generic (void) pointers to the items under comparison. */
int compare_ints(const void *p, const void *q) {
    int x = *(const int *)p;
    int y = *(const int *)q;

    /* Avoid return x - y, which can cause undefined behaviour
       because of signed integer overflow. */
    if (x < y)
        return -1;  // Return -1 if you want ascending, 1 if you want descending order. 
    else if (x > y)
        return 1;   // Return 1 if you want ascending, -1 if you want descending order.

    return 0;
    // All the logic is often alternatively written:
    return (x > y) - (x < y);
}

/* Sort an array of n integers, pointed to by a. */
void sort_ints(int *a, size_t n) {
    qsort(a, n, sizeof(*a), compare_ints);
}

Extensions

Since the comparison function of the original qsort only accepts two pointers, passing in additional parameters (e.g. producing a comparison function that compares by the two value's difference with another value) must be done using global variables. The issue was solved by the BSD and GNU Unix-like systems by a introducing qsort_r function, which allows for an additional parameter to be passed to the comparison function. The two versions of qsort_r have different argument orders. C11 Annex K defines a qsort_s essentially identical to GNU's qsort_r. The macOS and FreeBSD libcs also contain qsort_b, a variant that uses blocks, an analogue to closures, as an alternate solution to the same problem.[8]

References