Implicit data structure

From HandWiki
Revision as of 22:23, 8 February 2024 by Steve Marsio (talk | contribs) (simplify)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)

In computer science, an implicit data structure or space-efficient data structure is a data structure that stores very little information other than the main or required data: a data structure that requires low overhead. They are called "implicit" because the position of the elements carries meaning and relationship between elements; this is contrasted with the use of pointers to give an explicit relationship between elements. Definitions of "low overhead" vary, but generally means constant overhead; in big O notation, O(1) overhead. A less restrictive definition is a succinct data structure, which allows greater overhead.

Definition

An implicit data structure is one with constant O(1) space overhead (above the information-theoretic lower bound).

Historically, (Munro Suwanda) defined an implicit data structure (and algorithms acting on one) as one "in which structural information is implicit in the way data are stored, rather than explicit in pointers." They are somewhat vague in the definition, defining it most strictly as a single array, with only the size retained (a single number of overhead),[1] or more loosely as a data structure with constant overhead (O(1)).[2] This latter definition is today more standard, and the still-looser notion of a data structure with non-constant but small o(n) overhead is today known as a succinct data structure, as defined by (Jacobson 1988); it was referred to as semi-implicit by (Munro Suwanda).[3]

A fundamental distinction is between static data structures (read-only) and dynamic data structures (which can be modified). Simple implicit data structures, such as representing a sorted list as an array, may be very efficient as a static data structure, but inefficient as a dynamic data structure, due to modification operations (such as insertion in the case of a sorted list) being inefficient.

Examples

A trivial example of an implicit data structure is an array data structure, which is an implicit data structure for a list, and requires only the constant overhead of the length; unlike a linked list, which has a pointer associated with each data element, which explicitly gives the relationship from one element to the next. Similarly, a null-terminated string is an implicit data structure for a string (list of characters). These are considered very simple because they are static data structures (read-only), and only admit the simple operation of iteration over the elements.

Similarly simple is representing a multi-dimensional array as a single 1-dimensional array, together with its dimensions. For example, representing an m × n array as a single list of length m·n, together with the numbers m and n (instead of as a 1-dimensional array of pointers to each 1-dimensional subarray). The elements need not be of the same type, and a table of data (a list of records) may similarly be represented implicitly as a flat (1-dimensional) list, together with the length of each field, so long as each field has uniform size (so a single size can be used per field, not per record).

A less trivial example is representing a sorted list by a sorted array, which allows search in logarithmic time by binary search. Contrast with a search tree, specifically a binary search tree, which also allows logarithmic-time search, but requires pointers. A sorted array is only efficient as a static data structure, as modifying the list is slow – unlike a binary search tree – but does not require the space overhead of a tree.

An important example of an implicit data structure is representing a perfect binary tree as a list, in increasing order of depth, so root, first left child, first right child, first left child of first left child, etc. Such a tree occurs notably for an ancestry chart to a given depth, and the implicit representation is known as an Ahnentafel (ancestor table).

This can be generalized to a complete binary tree (where the last level may be incomplete), which yields the best-known example of an implicit data structure, namely the binary heap, which is an implicit data structure for a priority queue. This is more sophisticated than earlier examples because it allows multiple operations, and is an efficient dynamic data structure (it allows efficient modification of the data): not only top, but also insert and pop.

More sophisticated implicit data structures include the beap (bi-parental heap).

History

The trivial examples of lists or tables of values date to prehistory, while historically non-trivial implicit data structures date at least to the Ahnentafel, which was introduced by Michaël Eytzinger in 1590 for use in genealogy. In formal computer science, the first implicit data structure is generally considered to be the sorted list, used for binary search, which was introduced by John Mauchly in 1946, in the Moore School Lectures, the first ever set of lectures regarding any computer-related topic.[4][5] The binary heap was introduced in (Williams 1964) to implement the heapsort.[5] The notion of an implicit data structure was formalized in (Munro Suwanda), as part of introducing and analyzing the beap.[5]

References

  1. "Thus, only a simple array is needed for the data.", p. 236; "We will draw no formal distinction between a pointer and an integer (index) in the range [math]\displaystyle{ [0, N] }[/math]. A data structure is then implicit, if the only such integer which need be retained is N itself.", p. 238
  2. "... one might prefer to permit a constant number of pointers to be retained and still designate the structure as implicit.", p. 238
  3. "We will also suggest two structures which might be described as “semi-implicit,” in that a variable, but o(N), number of pointers (indices) is kept.", p. 238
  4. Knuth 1998, §6.2.1 ("Searching an ordered table"), subsection "History and bibliography".
  5. Jump up to: 5.0 5.1 5.2 Franceschini, Gianni; Munro, J. Ian (2006). "Implicit dictionaries with O(1) modifications per update and fast search". Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms. Miami, FL, United States. pp. 404–413. doi:10.1145/1109557.1109603. 
  • Jacobson, G. J (1988). Succinct static data structures (Ph.D.). Pittsburgh, PA: Carnegie Mellon University.
  • Knuth, Donald (1998). Sorting and searching. The Art of Computer Programming. 3 (2nd ed.). Reading, MA: Addison-Wesley Professional. ISBN 978-0-201-89685-5. 
  • Munro, J.Ian; Suwanda, Hendra (October 1980). "Implicit data structures for fast search and update". Journal of Computer and System Sciences 21 (2): 236–250. doi:10.1016/0022-0000(80)90037-9. 
  • Williams, J. W. J. (June 1964). "Algorithm 232 - Heapsort". Communications of the ACM 7 (6): 347–348. doi:10.1145/512274.512284. 

Further reading

See publications of Hervé Brönnimann, J. Ian Munro, and Greg Frederickson.