Hash table

Type  Unordered associative array  
Invented  1953  

In computing, a hash table, also known as a hash map, is a data structure that implements an associative array, also called a dictionary, which is an abstract data type that maps keys to values.^{[2]} A hash table uses a hash function to compute an index, also called a hash code, into an array of buckets or slots, from which the desired value can be found. During lookup, the key is hashed and the resulting hash indicates where the corresponding value is stored.
Ideally, the hash function will assign each key to a unique bucket, but most hash table designs employ an imperfect hash function, which might cause hash collisions where the hash function generates the same index for more than one key. Such collisions are typically accommodated in some way.
In a well-dimensioned hash table, the average time complexity for each lookup is independent of the number of elements stored in the table. Many hash table designs also allow arbitrary insertions and deletions of key–value pairs, at amortized constant average cost per operation.^{[3]}^{[4]}^{[5]}
Hashing is an example of a space-time tradeoff. If memory is infinite, the entire key can be used directly as an index to locate its value with a single memory access. On the other hand, if infinite time is available, values can be stored without regard for their keys, and a binary search or linear search can be used to retrieve the element.^{[6]}^{:458}
In many situations, hash tables turn out to be on average more efficient than search trees or any other table lookup structure. For this reason, they are widely used in many kinds of computer software, particularly for associative arrays, database indexing, caches, and sets.
History
The idea of hashing arose independently in different places. In January 1953, Hans Peter Luhn wrote an internal IBM memorandum that used hashing with chaining. The first example of open addressing was proposed by A. D. Linh, building on Luhn's memorandum.^{[7]} Around the same time, Gene Amdahl, Elaine M. McGraw, Nathaniel Rochester, and Arthur Samuel of IBM Research implemented hashing for the IBM 701 assembler.^{[8]}^{:124} Open addressing with linear probing is credited to Amdahl, although Andrey Ershov independently had the same idea.^{[8]} The term "open addressing" was coined by W. Wesley Peterson in his article discussing the problem of search in large files.^{[7]}^{:15}
The first published work on hashing with chaining is credited to Arnold Dumey, who discussed the idea of using remainder modulo a prime as a hash function.^{[7]}^{:15} The word "hashing" was first published in an article by Robert Morris.^{[8]}^{:126} A theoretical analysis of linear probing was submitted originally by Konheim and Weiss.^{[7]}^{:15}
Overview
An associative array stores a set of (key, value) pairs and allows insertion, deletion, and lookup (search), with the constraint of unique keys. In the hash table implementation of associative arrays, an array [math]\displaystyle{ A }[/math] of length [math]\displaystyle{ m }[/math] is partially filled with [math]\displaystyle{ n }[/math] elements, where [math]\displaystyle{ m \ge n }[/math]. A value [math]\displaystyle{ x }[/math] gets stored at an index location [math]\displaystyle{ A[h(x)] }[/math], where [math]\displaystyle{ h }[/math] is a hash function, and [math]\displaystyle{ h(x) \lt m }[/math].^{[7]}^{:2} Under reasonable assumptions, hash tables have better time complexity bounds on search, delete, and insert operations in comparison to self-balancing binary search trees.^{[7]}^{:1}
Hash tables are also commonly used to implement sets, by omitting the stored value for each key and merely tracking whether the key is present.^{[7]}^{:1}
Load factor
A load factor [math]\displaystyle{ \alpha }[/math] is a critical statistic of a hash table, and is defined as follows:^{[1]} [math]\displaystyle{ \text{load factor}\ (\alpha) = \frac{n}{m}, }[/math] where
 [math]\displaystyle{ n }[/math] is the number of entries occupied in the hash table.
 [math]\displaystyle{ m }[/math] is the number of buckets.
The performance of the hash table deteriorates in relation to the load factor [math]\displaystyle{ \alpha }[/math].^{[7]}^{:2}
To maintain good performance, the software makes sure the load factor [math]\displaystyle{ \alpha }[/math] never exceeds some constant [math]\displaystyle{ \alpha_{\max} }[/math].^{[9]}
Therefore a hash table is resized or rehashed whenever the load factor [math]\displaystyle{ \alpha }[/math] reaches [math]\displaystyle{ \alpha_{\max} }[/math].^{[9]}
A table is also resized if the load factor drops below [math]\displaystyle{ \alpha_{\max}/4 }[/math].^{[9]}
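The resizing rule above can be sketched in a few lines of Python. This is only an illustration: the function names, the default threshold of 0.75, and the choice to double or halve the bucket count are assumptions made for the example, not details given by the cited sources.

```python
def load_factor(n: int, m: int) -> float:
    """Return alpha = n / m for n stored entries and m buckets."""
    return n / m


def next_bucket_count(n: int, m: int, alpha_max: float = 0.75) -> int:
    """Return the bucket count to use after an insertion or deletion.

    Grows when alpha reaches alpha_max and shrinks when alpha drops
    below alpha_max / 4. Doubling and halving are assumed policies.
    """
    alpha = load_factor(n, m)
    if alpha >= alpha_max:
        return m * 2
    if alpha < alpha_max / 4 and m > 1:
        return max(1, m // 2)
    return m
```

For example, with 6 entries in 8 buckets the load factor is 0.75, so the sketch asks for 16 buckets; with 1 entry in 8 buckets it asks for 4.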
Load factor for separate chaining
With separate chaining hash tables, each slot of the bucket array stores a pointer to a list or array of data.^{[10]}
Separate chaining hash tables suffer gradually declining performance as the load factor grows, and there is no fixed point beyond which resizing is absolutely needed.^{[9]}
With separate chaining, the value of [math]\displaystyle{ \alpha_{\max} }[/math] that gives best performance is typically between 1 and 3.^{[9]}
Load factor for open addressing
With open addressing, each slot of the bucket array holds exactly one item. Therefore an open-addressed hash table cannot have a load factor greater than 1.^{[10]}
The performance of open addressing becomes very bad when the load factor approaches 1.^{[9]}
Therefore a hash table that uses open addressing must be resized or rehashed if the load factor [math]\displaystyle{ \alpha }[/math] approaches 1.^{[9]}
With open addressing, acceptable figures of max load factor [math]\displaystyle{ \alpha_{\max} }[/math] should range around 0.6 to 0.75.^{[11]}^{[12]}^{:110}
Hash function
A hash function [math]\displaystyle{ h : U \rightarrow \{0, ..., m-1\} }[/math] maps the universe [math]\displaystyle{ U }[/math] of keys to array indices or slots within the table, that is, [math]\displaystyle{ h(x) \in \{0, ..., m-1\} }[/math] for each [math]\displaystyle{ x \in S }[/math], where [math]\displaystyle{ S \subseteq U }[/math] is the set of keys stored in the table. The conventional implementations of hash functions are based on the integer universe assumption that all elements of the table stem from the universe [math]\displaystyle{ U = \{0, ..., u - 1\} }[/math], where the bit length of [math]\displaystyle{ u }[/math] is confined within the word size of a computer architecture.^{[7]}^{:2}
A perfect hash function [math]\displaystyle{ h }[/math] is defined as an injective function such that each element [math]\displaystyle{ x }[/math] in [math]\displaystyle{ S }[/math] maps to a unique value in [math]\displaystyle{ \{0, ..., m-1\} }[/math].^{[13]}^{[14]} A perfect hash function can be created if all the keys are known ahead of time.^{[13]}
Integer universe assumption
The schemes of hashing used under the integer universe assumption include hashing by division, hashing by multiplication, universal hashing, dynamic perfect hashing, and static perfect hashing.^{[7]}^{:2} However, hashing by division is the most commonly used scheme.^{[15]}^{:264}^{[12]}
Hashing by division
The scheme in hashing by division is as follows:^{[7]}^{:2} [math]\displaystyle{ h(x)\ =\ M\, \bmod\, m, }[/math] where [math]\displaystyle{ M }[/math] is the hash digest of [math]\displaystyle{ x \in S }[/math] and [math]\displaystyle{ m }[/math] is the size of the table.
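As a small illustration of the formula, the following Python sketch reduces an integer digest to a slot index. The function name and the error check are assumptions made for the example.

```python
def hash_by_division(digest: int, m: int) -> int:
    """Map an integer hash digest M to a slot in 0..m-1 via M mod m."""
    if m <= 0:
        raise ValueError("table size m must be positive")
    return digest % m
```

With a table of size 7, the digests 100 and 2 both land in slot 2, which is an example of a collision.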
Hashing by multiplication
The scheme in hashing by multiplication is as follows:^{[7]}^{:2–3} [math]\displaystyle{ h(x) = \lfloor m \bigl((M A) \bmod 1\bigr) \rfloor, }[/math] where [math]\displaystyle{ A }[/math] is a real-valued constant and [math]\displaystyle{ m }[/math] is the size of the table. An advantage of hashing by multiplication is that the value of [math]\displaystyle{ m }[/math] is not critical.^{[7]}^{:2–3} Although any value [math]\displaystyle{ A }[/math] produces a hash function, Donald Knuth suggests using the golden ratio.^{[7]}^{:3}
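A minimal floating-point sketch of the multiplicative scheme in Python follows. The function name is hypothetical, and the constant used here is the fractional part of the golden ratio, (√5 − 1)/2; for an integer digest this gives the same value of (M A) mod 1 as the golden ratio itself, up to rounding. The final `min` is a defensive guard against floating-point rounding and is not part of the formula.

```python
import math

# Fractional part of the golden ratio, about 0.618 (assumed choice of A).
A = (math.sqrt(5) - 1) / 2


def hash_by_multiplication(digest: int, m: int, a: float = A) -> int:
    """Return floor(m * ((M * a) mod 1)) for a non-negative integer digest M."""
    frac = (digest * a) % 1.0          # fractional part of M * A, in [0, 1)
    return min(m - 1, int(m * frac))   # floor, clamped into 0..m-1
```

Because the index comes from the fractional part of a product rather than from a remainder, the table size can be chosen freely, for instance as a power of two.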
Choosing a hash function
Uniform distribution of the hash values is a fundamental requirement of a hash function. A non-uniform distribution increases the number of collisions and the cost of resolving them. Uniformity is sometimes difficult to ensure by design, but may be evaluated empirically using statistical tests, e.g., Pearson's chi-squared test for discrete uniform distributions.^{[16]}^{[17]}
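The empirical check mentioned above can be sketched as follows. The function name is an assumption, and the sketch only computes Pearson's statistic for observed bucket indices against a uniform expectation; comparing it with a critical value (for m − 1 degrees of freedom) is left out.

```python
from collections import Counter


def chi_squared_statistic(indices, m):
    """Pearson's chi-squared statistic for a list of bucket indices,
    tested against a uniform distribution over m buckets."""
    counts = Counter(indices)
    expected = len(indices) / m
    return sum((counts.get(b, 0) - expected) ** 2 / expected for b in range(m))
```

A perfectly even spread such as `[0, 1, 2, 3]` over four buckets gives 0.0, while `[0, 0, 0, 0]` gives 12.0; larger values indicate a less uniform distribution.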
The distribution needs to be uniform only for table sizes that occur in the application. In particular, if one uses dynamic resizing with exact doubling and halving of the table size, then the hash function needs to be uniform only when the size is a power of two. Here the index can be computed as some range of bits of the hash function. On the other hand, some hashing algorithms prefer to have the size be a prime number.^{[18]}
For open addressing schemes, the hash function should also avoid clustering, the mapping of two or more keys to consecutive slots. Such clustering may cause the lookup cost to skyrocket, even if the load factor is low and collisions are infrequent. The popular multiplicative hash is claimed to have particularly poor clustering behavior.^{[18]}^{[4]}
K-independent hashing offers a way to prove a certain hash function does not have bad keysets for a given type of hash table. A number of K-independence results are known for collision resolution schemes such as linear probing and cuckoo hashing. Since K-independence can prove a hash function works, one can then focus on finding the fastest possible such hash function.^{[19]}
Collision resolution
A search algorithm that uses hashing consists of two parts. The first part is computing a hash function which transforms the search key into an array index. In the ideal case, no two search keys hash to the same array index. However, this is not always the case, and it is impossible to guarantee for data that has not been seen in advance.^{[20]} Hence the second part of the algorithm is collision resolution. The two common methods for collision resolution are separate chaining and open addressing.^{[6]}
Separate chaining
In separate chaining, a linked list of key–value pairs is built for each array index. The collided items are chained together through a single linked list, which can be traversed to access the item with a given search key.^{[6]}^{:464} Collision resolution through chaining with linked lists is a common method of implementing hash tables. Let [math]\displaystyle{ T }[/math] and [math]\displaystyle{ x }[/math] be the hash table and the node respectively; the operations are as follows:^{[15]}
 ChainedHashInsert(T, k): insert x, the node holding key k, at the head of linked list T[h(k)]
 ChainedHashSearch(T, k): search for an element with key k in linked list T[h(k)]
 ChainedHashDelete(T, k): delete x, the node holding key k, from the linked list T[h(k)]
If the elements are comparable, either numerically or lexically, and are inserted into the list so that a total order is maintained, unsuccessful searches terminate faster.^{[20]}^{:520–521}
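The three chained operations can be sketched in Python as below. This is an illustrative sketch, not a reference implementation: the class and method names are hypothetical, Python's built-in `hash` stands in for the hash function h, the table has a fixed size, and inserting an existing key overwrites its value (an assumption that keeps keys unique).

```python
class Node:
    """One node of a bucket's singly linked list."""

    def __init__(self, key, value, next_node=None):
        self.key = key
        self.value = value
        self.next = next_node


class ChainedHashTable:
    def __init__(self, m=8):
        self.m = m
        self.heads = [None] * m  # head node of each bucket's list

    def _index(self, key):
        return hash(key) % self.m

    def search(self, key):
        """Walk the list T[h(k)] and return the node with this key, or None."""
        node = self.heads[self._index(key)]
        while node is not None:
            if node.key == key:
                return node
            node = node.next
        return None

    def insert(self, key, value):
        """Insert a new node at the head of T[h(k)], or update an existing key."""
        node = self.search(key)
        if node is not None:
            node.value = value
            return
        i = self._index(key)
        self.heads[i] = Node(key, value, self.heads[i])

    def delete(self, key):
        """Unlink the node with this key from T[h(k)]; return True if found."""
        i = self._index(key)
        prev, node = None, self.heads[i]
        while node is not None:
            if node.key == key:
                if prev is None:
                    self.heads[i] = node.next
                else:
                    prev.next = node.next
                return True
            prev, node = node, node.next
        return False
```

With `m=2`, the integer keys 1, 3 and 5 all fall into bucket 1 and form a single chain, which makes the traversal in `search` and `delete` easy to observe.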
Other data structures for separate chaining
If the keys are ordered, it could be efficient to use "self-organizing" concepts such as a self-balancing binary search tree, through which the theoretical worst case could be brought down to [math]\displaystyle{ O(\log{n}) }[/math], although this introduces additional complexity.^{[20]}^{:521}
In dynamic perfect hashing, two-level hash tables are used to reduce the lookup complexity to a guaranteed [math]\displaystyle{ O(1) }[/math] in the worst case. In this technique, the buckets of [math]\displaystyle{ k }[/math] entries are organized as perfect hash tables with [math]\displaystyle{ k^2 }[/math] slots, providing constant worst-case lookup time and low amortized time for insertion.^{[21]} A study shows array-based separate chaining to be 97% more performant when compared to the standard linked list method under heavy load.^{[22]}^{:99}
Techniques such as using a fusion tree for each bucket also result in constant time for all operations with high probability.^{[23]}
Caching and locality of reference
The linked-list implementation of separate chaining may not be cache-conscious because of poor spatial locality (locality of reference): when the nodes of the linked list are scattered across memory, list traversal during insert and search may entail CPU cache inefficiencies.^{[22]}
In cache-conscious variants of collision resolution through separate chaining, a dynamic array, found to be more cache-friendly, is used in place of the linked list or self-balancing binary search tree that is usually deployed, since the contiguous allocation pattern of the array can be exploited by hardware-cache prefetchers (such as the translation lookaside buffer), resulting in reduced access time and memory consumption.^{[24]}^{[25]}^{[26]}
Open addressing
Open addressing is another collision resolution technique in which every entry record is stored in the bucket array itself, and the hash resolution is performed through probing. When a new entry has to be inserted, the buckets are examined, starting with the hashed-to slot and proceeding in some probe sequence, until an unoccupied slot is found. When searching for an entry, the buckets are scanned in the same sequence, until either the target record is found, or an unused array slot is found, which indicates an unsuccessful search.^{[27]}
Well-known probe sequences include:
 Linear probing, in which the interval between probes is fixed (usually 1).^{[28]}
 Quadratic probing, in which the interval between probes is increased by adding the successive outputs of a quadratic polynomial to the value given by the original hash computation.^{[29]}^{:272}
 Double hashing, in which the interval between probes is computed by a secondary hash function.^{[29]}^{:272–273}
Open addressing may be slower than separate chaining, since the probe sequence lengthens as the load factor [math]\displaystyle{ \alpha }[/math] approaches 1.^{[9]}^{[22]}^{:93} In a completely filled table, where the load factor reaches 1, probing results in an infinite loop.^{[6]}^{:471} The average cost of linear probing depends on the hash function's ability to distribute the elements uniformly throughout the table to avoid clustering, since the formation of clusters increases search time.^{[6]}^{:472}
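Insertion and search with linear probing (interval 1) can be sketched in Python as follows. The class and method names are hypothetical, Python's built-in `hash` stands in for the hash function, deletion is omitted, and the probe is capped at m steps so that a full table raises an error instead of looping forever; that cap is an assumption of this sketch.

```python
class LinearProbingTable:
    _EMPTY = object()  # sentinel marking an unused slot

    def __init__(self, m=8):
        self.m = m
        self.keys = [self._EMPTY] * m
        self.values = [None] * m

    def _probe(self, key):
        """Yield h(k), h(k)+1, ... modulo m, visiting each slot at most once."""
        start = hash(key) % self.m
        for step in range(self.m):
            yield (start + step) % self.m

    def insert(self, key, value):
        for i in self._probe(key):
            if self.keys[i] is self._EMPTY:
                self.keys[i] = key
                self.values[i] = value
                return
            if self.keys[i] == key:
                self.values[i] = value  # key already present: update
                return
        raise RuntimeError("table is full")

    def search(self, key):
        for i in self._probe(key):
            if self.keys[i] is self._EMPTY:
                return None  # unused slot reached: unsuccessful search
            if self.keys[i] == key:
                return self.values[i]
        return None
```

In a table with `m=4`, the keys 0, 4 and 8 all hash to slot 0 and end up in slots 0, 1 and 2, a small cluster of the kind described above.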
Caching and locality of reference
Since the slots are located in successive locations, linear probing could lead to better utilization of CPU cache due to locality of references resulting in reduced memory latency.^{[28]}
Other collision resolution techniques based on open addressing
Coalesced hashing
Coalesced hashing is a hybrid of both separate chaining and open addressing in which the buckets or nodes link within the table.^{[30]} The algorithm is ideally suited for fixed memory allocation.^{[30]}^{:4} A collision in coalesced hashing is resolved by identifying the largest-indexed empty slot in the hash table and inserting the colliding value into that slot. The bucket is also linked to the inserted node's slot, which contains its colliding hash address.^{[30]}^{:8}
Cuckoo hashing
Cuckoo hashing is a form of open addressing collision resolution technique which guarantees [math]\displaystyle{ O(1) }[/math] worst-case lookup complexity and constant amortized time for insertions. Collisions are resolved by maintaining two hash tables, each with its own hash function: the new item replaces the occupant of the collided slot, and the displaced element is moved into the other hash table. The process continues until every key has its own spot in the empty buckets of the tables; if the procedure enters an infinite loop, which is detected by maintaining a threshold loop counter, both hash tables are rehashed with newer hash functions and the procedure continues.^{[31]}
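The displacement loop described above can be sketched in Python as follows. All names are hypothetical. Two assumptions go beyond the description: the hash functions are simulated by salting Python's built-in `hash` with random values, and a rehash also doubles the table size so that the sketch terminates even when the tables are too full.

```python
import random


class CuckooHashTable:
    def __init__(self, m=8, max_loop=32, seed=0):
        self.m = m
        self.max_loop = max_loop              # threshold loop counter
        self.rng = random.Random(seed)
        self.tables = [[None] * m, [None] * m]
        self._new_hash_functions()

    def _new_hash_functions(self):
        # One random salt per table stands in for "its own hash function".
        self.salts = [self.rng.getrandbits(64), self.rng.getrandbits(64)]

    def _index(self, which, key):
        return hash((self.salts[which], key)) % self.m

    def lookup(self, key):
        """Check at most two slots, one per table."""
        for which in (0, 1):
            entry = self.tables[which][self._index(which, key)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None

    def insert(self, key, value):
        for which in (0, 1):                  # update in place if present
            i = self._index(which, key)
            entry = self.tables[which][i]
            if entry is not None and entry[0] == key:
                self.tables[which][i] = (key, value)
                return
        entry, which = (key, value), 0
        for _ in range(self.max_loop):
            i = self._index(which, entry[0])
            displaced = self.tables[which][i]
            self.tables[which][i] = entry     # take the slot
            if displaced is None:
                return
            entry = displaced                 # evicted item moves on
            which = 1 - which                 # ... into the other table
        self._rehash(entry)

    def _rehash(self, pending):
        items = [e for t in self.tables for e in t if e is not None]
        items.append(pending)
        self.m *= 2                           # assumption: grow on rehash
        self.tables = [[None] * self.m, [None] * self.m]
        self._new_hash_functions()
        for k, v in items:
            self.insert(k, v)
```

A lookup never inspects more than two slots, which is where the constant worst-case lookup bound comes from; the cost of collisions is shifted to insertion.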
Hopscotch hashing
Hopscotch hashing is an open addressing based algorithm which combines the elements of cuckoo hashing, linear probing and chaining through the notion of a neighbourhood of buckets: the subsequent buckets around any given occupied bucket, also called a "virtual" bucket.^{[32]} The algorithm is designed to deliver better performance when the load factor of the hash table grows beyond 90%; it also provides high throughput in concurrent settings, which makes it well suited for implementing a resizable concurrent hash table.^{[32]}^{:350} The neighbourhood characteristic of hopscotch hashing guarantees that the cost of finding the desired item in any bucket within the neighbourhood is very close to the cost of finding it in the bucket itself; the algorithm attempts to bring an item into its neighbourhood, at the possible cost of displacing other items.^{[32]}^{:352}
Each bucket within the hash table includes additional "hop-information": an H-bit bit array indicating the relative distance of the items that were originally hashed into the current virtual bucket, within [math]\displaystyle{ H-1 }[/math] entries.^{[32]}^{:352} Let [math]\displaystyle{ k }[/math] and [math]\displaystyle{ B_k }[/math] be the key to be inserted and the bucket to which the key is hashed, respectively; the insertion procedure involves several cases so that the neighbourhood property of the algorithm is preserved:^{[32]}^{:352–353}
 If [math]\displaystyle{ B_k }[/math] is empty, the element is inserted, and the leftmost bit of the bitmap is set to 1.
 If [math]\displaystyle{ B_k }[/math] is not empty, linear probing is used to find an empty slot in the table; the bitmap of the bucket is updated, followed by the insertion.
 If the empty slot is not within the range of the neighbourhood, i.e. [math]\displaystyle{ H-1 }[/math], subsequent swaps and hop-info bit array manipulations of each bucket are performed in accordance with its neighbourhood invariant properties.^{[32]}^{:353}
Robin Hood hashing
Robin Hood hashing is an open addressing based collision resolution algorithm; collisions are resolved by favouring the element that is farthest from its "home location", i.e. the bucket to which the item was hashed. That element, the one with the longest probe sequence length (PSL), takes the contested bucket and the other element is displaced.^{[33]} Although Robin Hood hashing does not change the theoretical search cost, it significantly affects the variance of the distribution of the items over the buckets,^{[34]} i.e. it deals with cluster formation in the hash table.^{[35]} Each node within a hash table that uses Robin Hood hashing should be augmented to store an extra PSL value.^{[36]} Let [math]\displaystyle{ x }[/math] be the key to be inserted, [math]\displaystyle{ x.psl }[/math] be the (incremental) PSL of [math]\displaystyle{ x }[/math], [math]\displaystyle{ T }[/math] be the hash table and [math]\displaystyle{ j }[/math] be the index; the insertion procedure is as follows:^{[33]}^{:12–13}^{[37]}
 If [math]\displaystyle{ x.psl\ \le\ T[j].psl }[/math]: the probe moves on to the next bucket, leaving [math]\displaystyle{ T[j] }[/math] in place.
 If [math]\displaystyle{ x.psl\ \gt \ T[j].psl }[/math]: insert the item [math]\displaystyle{ x }[/math] into the bucket [math]\displaystyle{ j }[/math], taking out its previous occupant [math]\displaystyle{ T[j] }[/math] (call it [math]\displaystyle{ x' }[/math]); continue the probe from bucket [math]\displaystyle{ j+1 }[/math] to insert [math]\displaystyle{ x' }[/math]; repeat the procedure until every element is inserted.
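The two cases above can be sketched in Python as follows. The class and method names are hypothetical, Python's built-in `hash` stands in for the hash function, linear probing is assumed as the probe sequence, and deletion is left out. Each slot stores a `[key, value, psl]` list, matching the extra PSL field mentioned above.

```python
class RobinHoodTable:
    def __init__(self, m=8):
        self.m = m
        self.slots = [None] * m   # each slot: [key, value, psl] or None
        self.n = 0

    def _find(self, key):
        """Return the slot index holding key, or None if absent."""
        j, psl = hash(key) % self.m, 0
        while psl < self.m:
            slot = self.slots[j]
            # An occupant closer to home than we are means the key
            # cannot be further along: it would have taken this slot.
            if slot is None or slot[2] < psl:
                return None
            if slot[0] == key:
                return j
            j, psl = (j + 1) % self.m, psl + 1
        return None

    def get(self, key):
        j = self._find(key)
        return None if j is None else self.slots[j][1]

    def insert(self, key, value):
        j = self._find(key)
        if j is not None:
            self.slots[j][1] = value          # key present: update
            return
        if self.n == self.m:
            raise RuntimeError("table is full")
        entry = [key, value, 0]
        j = hash(key) % self.m
        while True:
            slot = self.slots[j]
            if slot is None:
                self.slots[j] = entry
                self.n += 1
                return
            if entry[2] > slot[2]:
                # x.psl > T[j].psl: x takes bucket j, the occupant moves on.
                self.slots[j], entry = entry, slot
            j = (j + 1) % self.m
            entry[2] += 1                     # one step further from home
```

Inserting keys 1, 0 and 8 into a table with `m=8` shows a displacement: key 8 (home slot 0) reaches slot 1 with PSL 1, evicts key 1 (PSL 0), and key 1 settles in slot 2 with PSL 1.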
Dynamic resizing
Repeated insertions cause the number of entries in a hash table to grow, which increases the load factor. To maintain the amortized [math]\displaystyle{ O(1) }[/math] performance of lookup and insertion, a hash table is dynamically resized and its items are rehashed into the buckets of the new hash table,^{[9]} since the items cannot simply be copied over: a different table size yields a different hash value under the modulo operation.^{[38]} If a hash table becomes "too empty" after some elements are deleted, resizing may be performed to avoid excessive memory usage.^{[39]}
Resizing by moving all entries
Generally, a new hash table with a size double that of the original hash table gets allocated privately, and every item in the original hash table gets moved to the newly allocated one by computing the hash values of the items followed by the insertion operation. Rehashing is simple, but computationally expensive.^{[40]}
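A minimal Python sketch of all-at-once rehashing follows. It assumes, for illustration only, a chained table represented as a list of buckets, each bucket being a list of `(key, value)` pairs, with Python's built-in `hash` reduced modulo the table size; the function name is hypothetical.

```python
def rehash_doubling(buckets):
    """Return a new bucket array of twice the size with every entry reinserted."""
    new_m = 2 * len(buckets)
    new_buckets = [[] for _ in range(new_m)]
    for bucket in buckets:
        for key, value in bucket:
            # The index must be recomputed: it depends on the table size.
            new_buckets[hash(key) % new_m].append((key, value))
    return new_buckets
```

Growing a two-bucket table that holds the keys 0 to 4 moves key 2 from bucket 0 to bucket 2 and key 3 from bucket 1 to bucket 3, which is why entries cannot simply be copied to the same positions.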
Alternatives to all-at-once rehashing
Some hash table implementations, notably in real-time systems, cannot pay the price of enlarging the hash table all at once, because it may interrupt time-critical operations. If dynamic resizing cannot be avoided, one solution is to perform the resizing gradually. This avoids the storage blip during rehashing (typically 50% of the new table's size) and the memory fragmentation that triggers heap compaction when the large memory block of the old hash table is deallocated.^{[41]} In that case, rehashing is done incrementally by extending the memory block previously allocated for the old hash table, so that the buckets of the hash table remain unaltered. A common approach to amortized rehashing involves maintaining two hash functions [math]\displaystyle{ h_\text{old} }[/math] and [math]\displaystyle{ h_\text{new} }[/math]. The process of rehashing a bucket's items in accordance with the new hash function is termed cleaning. It is implemented through the command pattern, by encapsulating operations such as [math]\displaystyle{ \mathrm{Add}(\mathrm{key}) }[/math], [math]\displaystyle{ \mathrm{Get}(\mathrm{key}) }[/math] and [math]\displaystyle{ \mathrm{Delete}(\mathrm{key}) }[/math] in a [math]\displaystyle{ \mathrm{Lookup}(\mathrm{key}, \text{command}) }[/math] wrapper so that each element in the bucket gets rehashed. The procedure is as follows:^{[41]}^{:3}
 Clean [math]\displaystyle{ \mathrm{Table}[h_\text{old}(\mathrm{key})] }[/math] bucket.
 Clean [math]\displaystyle{ \mathrm{Table}[h_\text{new}(\mathrm{key})] }[/math] bucket.
 The command gets executed.
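The three steps can be sketched loosely in Python as below. This is an illustration under several assumptions rather than the cited design: the class and method names are hypothetical, commands are passed as plain strings, each bucket is a small Python `dict` for brevity, the table exactly doubles in place, and a set of "dirty" indices records which buckets have not been cleaned yet.

```python
class IncrementalTable:
    def __init__(self, m=4):
        self.old_m = m                       # modulus used by h_old
        self.new_m = m                       # modulus used by h_new
        self.buckets = [dict() for _ in range(m)]
        self.dirty = set()                   # buckets not yet cleaned

    def start_resize(self):
        """Double the bucket array in place without moving any entry."""
        for i in list(self.dirty):           # finish any earlier resize
            self._clean(i)
        self.old_m = self.new_m
        self.new_m = 2 * self.old_m
        self.buckets.extend(dict() for _ in range(self.old_m))
        self.dirty = set(range(self.old_m))

    def _clean(self, i):
        """Rehash the items of bucket i according to h_new."""
        if i not in self.dirty:
            return
        self.dirty.discard(i)
        bucket = self.buckets[i]
        for key in list(bucket):
            j = hash(key) % self.new_m
            if j != i:
                self.buckets[j][key] = bucket.pop(key)

    def lookup(self, key, command, value=None):
        self._clean(hash(key) % self.old_m)  # clean Table[h_old(key)]
        self._clean(hash(key) % self.new_m)  # clean Table[h_new(key)]
        bucket = self.buckets[hash(key) % self.new_m]
        if command == "add":                 # execute the command
            bucket[key] = value
            return None
        if command == "get":
            return bucket.get(key)
        if command == "delete":
            return bucket.pop(key, None)
        raise ValueError("unknown command: " + command)
```

After `start_resize()`, each operation pays only for cleaning the one or two buckets it touches, so the cost of rehashing is spread over later operations instead of being paid at once.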
Linear hashing
Linear hashing is an implementation of the hash table which enables the table to grow or shrink dynamically, one bucket at a time.^{[42]}
Performance
The performance of a hash table depends on the hash function's ability to generate quasi-random numbers ([math]\displaystyle{ \sigma }[/math]) for entries in the hash table, where [math]\displaystyle{ K }[/math], [math]\displaystyle{ n }[/math] and [math]\displaystyle{ h(x) }[/math] denote the key, the number of buckets and the hash function, such that [math]\displaystyle{ \sigma\ =\ h(K)\ \%\ n }[/math]. If the hash function generates the same [math]\displaystyle{ \sigma }[/math] for distinct keys ([math]\displaystyle{ K_1 \ne K_2,\ h(K_1)\ =\ h(K_2) }[/math]), a collision results, which is dealt with in a variety of ways. The constant time complexity ([math]\displaystyle{ O(1) }[/math]) of operations in a hash table presupposes that the hash function does not generate colliding indices; thus, the performance of the hash table is directly proportional to the chosen hash function's ability to disperse the indices.^{[43]}^{:1} However, constructing such a hash function is practically infeasible, so implementations depend on case-specific collision resolution techniques to achieve higher performance.^{[43]}^{:2}
Applications
Associative arrays
Hash tables are commonly used to implement many types of in-memory tables. They are used to implement associative arrays.^{[29]}
Database indexing
Hash tables may also be used as disk-based data structures and database indices (such as in dbm), although B-trees are more popular in these applications.^{[44]}
Caches
Hash tables can be used to implement caches, auxiliary data tables that are used to speed up the access to data that is primarily stored in slower media. In this application, hash collisions can be handled by discarding one of the two colliding entries, usually by erasing the old item that is currently stored in the table and overwriting it with the new item, so every item in the table has a unique hash value.^{[45]}^{[46]}
Sets
Hash tables can be used in the implementation of the set data structure, which stores unique values without any particular order; a set is typically used to test the membership of a value in the collection, rather than for element retrieval.^{[47]}
Transposition table
A transposition table is a complex hash table which stores information about each section that has been searched.^{[48]}
Implementations
Many programming languages provide hash table functionality, either as built-in associative arrays or as standard library modules.
In JavaScript, an "object" is a mutable collection of key-value pairs (called "properties"), where each key is either a string or a guaranteed-unique "symbol"; any other value, when used as a key, is first coerced to a string. Aside from the seven "primitive" data types, every value in JavaScript is an object.^{[49]} ECMAScript 2015 also added the Map data structure, which accepts arbitrary values as keys.^{[50]}
C++11 includes unordered_map in its standard library for storing keys and values of arbitrary types.^{[51]}
Go's built-in map implements a hash table in the form of a type.^{[52]}
The Java programming language includes the HashSet, HashMap, LinkedHashSet, and LinkedHashMap generic collections.^{[53]}
Python's built-in dict implements a hash table in the form of a type.^{[54]}
Ruby's built-in Hash uses the open addressing model from Ruby 2.4 onwards.^{[55]}
The Rust programming language includes HashMap and HashSet as part of the Rust Standard Library.^{[56]}
The .NET standard library includes HashSet and Dictionary,^{[57]}^{[58]} so they can be used from languages such as C# and VB.NET.^{[59]}
See also
 Bloom filter
 Consistent hashing
 Distributed hash table
 Extendible hashing
 Hash array mapped trie
 Lazy deletion
 Pearson hashing
 PhotoDNA
 Rabin–Karp string search algorithm
 Search data structure
 Stable hashing
References
 ↑ ^{1.0} ^{1.1} Introduction to Algorithms (3rd ed.). Massachusetts Institute of Technology. 2009. pp. 253–280. ISBN 9780262033848.
 ↑ Mehlhorn, Kurt; Sanders, Peter (2008), "4 Hash Tables and Associative Arrays", Algorithms and Data Structures: The Basic Toolbox, Springer, pp. 81–98, http://people.mpiinf.mpg.de/~mehlhorn/ftp/Toolbox/HashTables.pdf
 ↑ Leiserson, Charles E. (Fall 2005). "Lecture 13: Amortized Algorithms, Table Doubling, Potential Method". course MIT 6.046J/18.410J Introduction to Algorithms. http://videolectures.net/mit6046jf05_leiserson_lec13/.
 ↑ ^{4.0} ^{4.1} Knuth, Donald (1998). The Art of Computer Programming. 3: Sorting and Searching (2nd ed.). AddisonWesley. pp. 513–558. ISBN 9780201896855.
 ↑ Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001). "Chapter 11: Hash Tables". Introduction to Algorithms (2nd ed.). MIT Press and McGrawHill. pp. 221–252. ISBN 9780262531962.
 ↑ ^{6.0} ^{6.1} ^{6.2} ^{6.3} ^{6.4} Sedgewick, Robert; Wayne, Kevin (2011). Algorithms. 1 (4 ed.). AddisonWesley Professional. https://algs4.cs.princeton.edu/.
 ↑ ^{7.00} ^{7.01} ^{7.02} ^{7.03} ^{7.04} ^{7.05} ^{7.06} ^{7.07} ^{7.08} ^{7.09} ^{7.10} ^{7.11} ^{7.12} ^{7.13} Mehta, Dinesh P.; Sahni, Sartaj (28 October 2004). "9: Hash Tables". Handbook of Data Structures and Applications (1 ed.). Taylor & Francis. doi:10.1201/9781420035179. ISBN 9781584884354. https://www.taylorfrancis.com/books/mono/10.1201/9781420035179/handbookdatastructuresapplicationsdineshmehtadineshmehtasartajsahni.
 ↑ ^{8.0} ^{8.1} ^{8.2} Konheim, Alan G. (21 June 2010). Hashing in Computer Science: Fifty Years of Slicing and Dicing. John Wiley & Sons, Inc.. doi:10.1002/9780470630617. ISBN 9780470630617. https://onlinelibrary.wiley.com/doi/book/10.1002/9780470630617.
 ↑ ^{9.0} ^{9.1} ^{9.2} ^{9.3} ^{9.4} ^{9.5} ^{9.6} ^{9.7} ^{9.8} Mayers, Andrew (2008). "CS 312: Hash tables and amortized analysis". Cornell University, Department of Computer Science. https://www.cs.cornell.edu/courses/cs312/2008sp/lectures/lec20.html.
 ↑ ^{10.0} ^{10.1} James S. Plank and Brad Vander Zanden. "CS140 Lecture notes  Hashing".
 ↑ Maurer, W.D.; Lewis, T.G. (1 March 1975). "Hash Table Methods". ACM Computing Surveys (Journal of the ACM) 1 (1): 14. doi:10.1145/356643.356645. https://dl.acm.org/doi/10.1145/356643.356645.
 ↑ ^{12.0} ^{12.1} Owolabi, Olumide (1 February 2003). "Empirical studies of some hashing functions". Information and Software Technology (Department of Mathematics and Computer Science, University of Port Harcourt) 45 (2): 109–112. doi:10.1016/S09505849(02)00174X. https://www.sciencedirect.com/science/article/abs/pii/S095058490200174X.
 ↑ ^{13.0} ^{13.1} Lu, Yi; Prabhakar, Balaji; Bonomi, Flavio (2006). "Perfect Hashing for Network Applications". 2006 IEEE International Symposium on Information Theory. pp. 2774–2778. doi:10.1109/ISIT.2006.261567. ISBN 142440505X.
 ↑ Belazzougui, Djamal; Botelho, Fabiano C.; Dietzfelbinger, Martin (2009). "Hash, displace, and compress". 5757. Berlin: Springer. pp. 682–693. doi:10.1007/9783642041280_61. http://cmph.sourceforge.net/papers/esa09.pdf.
 ↑ ^{15.0} ^{15.1} Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001). "Chapter 11: Hash Tables". Introduction to Algorithms (2nd ed.). Massachusetts Institute of Technology. ISBN 9780262531962.
 ↑ Pearson, Karl (1900). "On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling". Philosophical Magazine. Series 5 50 (302): 157–175. doi:10.1080/14786440009463897. https://zenodo.org/record/1430618.
 ↑ Plackett, Robin (1983). "Karl Pearson and the ChiSquared Test". International Statistical Review 51 (1): 59–72. doi:10.2307/1402731.
 ↑ ^{18.0} ^{18.1} Wang, Thomas (March 1997). "Prime Double Hash Table". https://www.concentric.net/~Ttwang/tech/primehash.htm.
 ↑ Wegman, Mark N.; Carter, J. Lawrence (1981). "New hash functions and their use in authentication and set equality". Journal of Computer and System Sciences 22 (3): 265–279. doi:10.1016/00220000(81)900337. Conference version in FOCS'79. http://www.fi.muni.cz/~xbouda1/teaching/2009/IV111/Wegman_Carter_1981_New_hash_functions.pdf. Retrieved 9 February 2011.
 ↑ ^{20.0} ^{20.1} ^{20.2} Donald E. Knuth (24 April 1998). The Art of Computer Programming: Volume 3: Sorting and Searching. AddisonWesley Professional. ISBN 9780201896855. https://dl.acm.org/doi/10.5555/280635.
 ↑ Demaine, Erik; Lind, Jeff (Spring 2003). "Lecture 2". 6.897: Advanced Data Structures. MIT Computer Science and Artificial Intelligence Laboratory. http://courses.csail.mit.edu/6.897/spring03/scribe_notes/L2/lecture2.pdf.
 ↑ ^{22.0} ^{22.1} ^{22.2} Askitis, Nikolas; Zobel, Justin (2005). "CacheConscious Collision Resolution in String Hash Tables". International Symposium on String Processing and Information Retrieval. Springer Science+Business Media. pp. 91–102. doi:10.1007/11575832_1. ISBN 9783540297406. https://link.springer.com/chapter/10.1007/11575832_11.
 ↑ Willard, Dan E. (2000). "Examining computational geometry, van Emde Boas trees, and hashing from the perspective of the fusion tree". SIAM Journal on Computing 29 (3): 1030–1049. doi:10.1137/S0097539797322425..
 ↑ Askitis, Nikolas; Sinha, Ranjan (2010). "Engineering scalable, cache and space efficient tries for strings". The VLDB Journal 17 (5): 634. doi:10.1007/s0077801001839. ISSN 10668888.
 ↑ Askitis, Nikolas; Zobel, Justin (October 2005). "Cacheconscious Collision Resolution in String Hash Tables". 3772/2005. pp. 91–102. doi:10.1007/11575832_11. ISBN 9783540297406.
 ↑ Askitis, Nikolas (2009). "Fast and Compact Hash Tables for Integer Keys". 91. pp. 113–122. ISBN 9781920682729. http://crpit.com/confpapers/CRPITV91Askitis.pdf. Retrieved June 13, 2010.
 ↑ Tenenbaum, Aaron M.; Langsam, Yedidyah; Augenstein, Moshe J. (1990). Data Structures Using C. Prentice Hall. pp. 456–461, p. 472. ISBN 9780131997462.
 ↑ ^{28.0} ^{28.1} Pagh, Rasmus; Rodler, Flemming Friche (2001). "Cuckoo Hashing". Algorithms — ESA 2001. Lecture Notes in Computer Science. 2161. pp. 121–133. doi:10.1007/3-540-44676-1_10. ISBN 9783540424932.
 ↑ ^{29.0} ^{29.1} ^{29.2} Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001), "11 Hash Tables", Introduction to Algorithms (2nd ed.), MIT Press and McGraw-Hill, pp. 221–252, ISBN 0262032937.
 ↑ ^{30.0} ^{30.1} ^{30.2} Vitter, Jeffrey S.; Chen, Wen-Chin (1987). The design and analysis of coalesced hashing. New York, United States: Oxford University Press. ISBN 9780195041828. https://archive.org/details/designanalysisof0000vitt/.
 ↑ Pagh, Rasmus; Rodler, Flemming Friche (2001). "Cuckoo Hashing". Algorithms — ESA 2001. Lecture Notes in Computer Science. 2161. pp. 121–133. doi:10.1007/3-540-44676-1_10. ISBN 9783540424932.
 ↑ ^{32.0} ^{32.1} ^{32.2} ^{32.3} ^{32.4} ^{32.5} Herlihy, Maurice; Shavit, Nir; Tzafrir, Moran (2008). "Hopscotch Hashing". International Symposium on Distributed Computing. 5218. Berlin, Heidelberg: Springer Publishing. pp. 350–364. doi:10.1007/978-3-540-87779-0_24. ISBN 9783540877783. https://link.springer.com/chapter/10.1007/978-3-540-87779-0_24.
 ↑ ^{33.0} ^{33.1} Celis, Pedro (1986). Robin Hood Hashing. Ontario, Canada: University of Waterloo, Dept. of Computer Science. ISBN 031529700X. OCLC 14083698. https://cs.uwaterloo.ca/research/tr/1986/CS-86-14.pdf. Retrieved 2 November 2021.
 ↑ Poblete, P.V.; Viola, A. (14 August 2018). "Analysis of Robin Hood and Other Hashing Algorithms Under the Random Probing Model, With and Without Deletions". Combinatorics, Probability and Computing (Cambridge University Press) 28 (4): 600–617. doi:10.1017/S0963548318000408. ISSN 14692163. https://www.cambridge.org/core/journals/combinatoricsprobabilityandcomputing/article/abs/analysisofrobinhoodandotherhashingalgorithmsundertherandomprobingmodelwithandwithoutdeletions/933D4F203E3C70EF15053287412242E0. Retrieved 1 November 2021.
 ↑ Clarkson, Michael (2014). "Lecture 13: Hash tables". Cornell University, Department of Computer Science. https://www.cs.cornell.edu/courses/cs3110/2014fa/lectures/13/lec13.html.
 ↑ Gries, David (2017). "JavaHyperText and Data Structure: Robin Hood Hashing". Cornell University, Department of Computer Science. https://www.cs.cornell.edu/courses/JavaAndDS/files/hashing_RobinHood.pdf.
 ↑ Template:Cite tech report
 ↑ Goddard, Wayne (2021). "Chapter C5: Hash Tables". Clemson University. pp. 15–16. https://people.computing.clemson.edu/~goddard/texts/algor/C5.pdf.
 ↑ Devadas, Srini; Demaine, Erik (25 February 2011). "Intro to Algorithms: Resizing Hash Tables". Massachusetts Institute of Technology, Department of Computer Science. https://courses.csail.mit.edu/6.006/spring11/rec/rec07.pdf.
 ↑ Thareja, Reema (13 October 2018). "Hashing and Collision". Data Structures Using C (2 ed.). Oxford University Press. ISBN 9780198099307. https://global.oup.com/academic/product/datastructuresusingc9780198099307.
 ↑ ^{41.0} ^{41.1} Friedman, Scott; Krishnan, Anand; Leidefrost, Nicholas (18 March 2003). "Hash Tables for Embedded and Real-time systems". All Computer Science and Engineering Research (Washington University in St. Louis). doi:10.7936/K7WD3XXV. https://users.cs.northwestern.edu/~sef318/docs/hashtables.pdf. Retrieved 9 November 2021.
 ↑ Litwin, Witold (1980). "Linear hashing: A new tool for file and table addressing". Carnegie Mellon University. pp. 212–223. https://www.cs.cmu.edu/afs/cs.cmu.edu/user/christos/www/courses/826resources/PAPERS+BOOK/linearhashing.PDF. Retrieved 10 November 2021.
 ↑ ^{43.0} ^{43.1} Dijk, Tom Van (2010). "Analysing and Improving Hash Table Performance". Netherlands: University of Twente. https://www.tvandijk.nl/pdf/bscthesis.pdf.
 ↑ Lech Banachowski. "Indexes and external sorting". Polsko-Japońska Akademia Technik Komputerowych. https://edux.pjwstk.edu.pl/mat/262/lec/rW9.htm.
 ↑ Zhong, Liang; Zheng, Xueqian; Liu, Yong; Wang, Mengting; Cao, Yang (February 2020). "Cache hit ratio maximization in device-to-device communications overlaying cellular networks". China Communications 17 (2): 232–238. doi:10.23919/jcc.2020.02.018. ISSN 1673-5447. http://dx.doi.org/10.23919/jcc.2020.02.018.
 ↑ Bottomley, James (1 January 2004). "Understanding Caching". Linux Journal. https://www.linuxjournal.com/article/7105.
 ↑ Jill Seaman (2014). "Set & Hash Tables". Texas State University. https://userweb.cs.txstate.edu/~js236/201412/cs5301/week13.pdf.
 ↑ "Transposition Table - Chessprogramming wiki". https://www.chessprogramming.org/Transposition_Table.
 ↑ "JavaScript data types and data structures - JavaScript - MDN". https://developer.mozilla.org/en-US/docs/Web/JavaScript/Data_structures#objects.
 ↑ "Map - JavaScript - MDN" (in en-US). 2023-06-20. https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Map.
 ↑ "Programming language C++ - Technical Specification". International Organization for Standardization. pp. 812–813. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3690.pdf.
 ↑ "The Go Programming Language Specification". https://go.dev/ref/spec#Map_types.
 ↑ "Lesson: Implementations (The Java™ Tutorials > Collections)". https://docs.oracle.com/javase/tutorial/collections/implementations/index.html.
 ↑ Zhang, Juan; Jia, Yunwei (2020). "Redis rehash optimization based on machine learning". Journal of Physics 1453 (1): 3. doi:10.1088/1742-6596/1453/1/012048. Bibcode: 2020JPhCS1453a2048Z.
 ↑ Jonan Scheffler (December 25, 2016). "Ruby 2.4 Released: Faster Hashes, Unified Integers and Better Rounding". https://blog.heroku.com/ruby-2-4-features-hashes-integers-rounding#hash-changes.
 ↑ "doc.rust-lang.org". https://doc.rust-lang.org/std/index.html.
 ↑ "HashSet Class (System.Collections.Generic)" (in en-us). https://learn.microsoft.com/en-us/dotnet/api/system.collections.generic.hashset-1?view=net-7.0.
 ↑ dotnet-bot. "Dictionary Class (System.Collections.Generic)" (in en-us). https://learn.microsoft.com/en-us/dotnet/api/system.collections.generic.dictionary-2?view=net-8.0.
 ↑ "VB.NET HashSet Example". https://www.dotnetperls.com/hashset-vbnet.
Further reading
 Tamassia, Roberto; Goodrich, Michael T. (2006). "Chapter Nine: Maps and Dictionaries". Data structures and algorithms in Java: [updated for Java 5.0] (4th ed.). Hoboken, NJ: Wiley. pp. 369–418. ISBN 9780471738848. https://archive.org/details/datastructuresal00good_183.
 McKenzie, B. J.; Harries, R.; Bell, T. (Feb 1990). "Selecting a hashing algorithm". Software: Practice and Experience 20 (2): 209–224. doi:10.1002/spe.4380200207.
External links
 NIST entry on hash tables
 Open Data Structures – Chapter 5 – Hash Tables, Pat Morin
 MIT's Introduction to Algorithms: Hashing 1 MIT OCW lecture Video
 MIT's Introduction to Algorithms: Hashing 2 MIT OCW lecture Video
Original source: https://en.wikipedia.org/wiki/Hash table.