Interning (computer science)

From HandWiki
Short description: Data structure for reusing strings


In computer science, interning is re-using objects of equal value on-demand instead of creating new objects. This creational pattern[1] is frequently used for numbers and strings in different programming languages. In many object-oriented languages such as Python, even primitive types such as integer numbers are objects. To avoid the overhead of constructing a large number of integer objects, these objects get reused through interning.[2]

For interning to work the interned objects must be immutable, since state is shared between multiple variables. String interning is a common application of interning, where many strings with identical values are needed in the same program.

History

Lisp introduced the notion of interned strings for its symbols. The LISP 1.5 Programmers Manual [3] describes a function called intern which either evaluates to an existing symbol of the supplied name, or if none exists, creates a new symbol of that name. This idea of interned symbols persists in more recent dialects of Lisp, such as Clojure in special forms such a (def symbol) which perform symbol creation and interning.[4]

In the object-oriented programming paradigm interning is an important mechanism in the flyweight pattern, where an interning method is called to store the intrinsic state of an object such that this can be shared among different objects which share different extrinsic state, avoiding needless duplication.[5]

Interning continues to be an important technique for managing memory use in programming language implementations; for example, the Java Language Specification requires that identical string literals (that is, literals that contain the same sequence of code points) must refer to the same instance of class String, because string literals are "interned" so as to share unique instances.[6] In the Python programming language small integers are interned,[7] though the details of exactly which are dependent on language version.

Motivation

Interning saves memory and can thus improve performance and memory footprint of a program.[8] The downside is time required to search for existing values of objects which are to be interned.

See also

References

  1. "Design Patterns". https://courses.cs.washington.edu/courses/cse331/11wi/lectures/lect09-design-patterns-1.pdf. 
  2. Jaworski, Michał (2019). Expert Python programming : become a master in Python by learning coding best practices and advanced programming concepts in Python 3.7. Tarek Ziadé (Third ed.). Birmingham, U.K.. ISBN 978-1-78980-677-9. OCLC 1125343555. https://www.worldcat.org/oclc/1125343555. 
  3. Levin, Michael I. (1965). LISP 1.5 programmer's manual : the Computation Center and Research Laboratory of Electronics, Massachusetts Institute of Technology. John McCarthy, Massachusetts Institute of Technology. Computation Center, Massachusetts Institute of Technology. Research Laboratory of Electronics (2nd ed.). Cambridge: M.I.T. Press. ISBN 0-262-13011-4. OCLC 1841373. https://www.worldcat.org/oclc/1841373. 
  4. "Clojure - Special Forms". https://clojure.org/reference/special_forms#def. 
  5. Design patterns : elements of reusable object-oriented software. Erich Gamma, Richard Helm, Ralph E. Johnson, John Vlissides. Reading, Mass.: Addison-Wesley. 1995. ISBN 0-201-63361-2. OCLC 31171684. https://www.worldcat.org/oclc/31171684. 
  6. "Java Language Specification. Chapter 3. Lexical Structure". https://docs.oracle.com/javase/specs/jls/se7/html/jls-3.html#jls-3.10.5. 
  7. "PEP 237 -- Unifying Long Integers and Integers" (in en). https://www.python.org/dev/peps/pep-0237/. 
  8. Oaks, Scott (2014). Java performance : the definitive guide. Sebastopol, CA: O'Reilly Media. ISBN 978-1-4493-6354-3. OCLC 878059649. https://www.worldcat.org/oclc/878059649. 

External links