SUHA (computer science)

In computer science, SUHA (Simple Uniform Hashing Assumption) is a basic assumption that facilitates the mathematical analysis of hash tables. The assumption states that a hypothetical hashing function will evenly distribute items into the slots of a hash table. Moreover, each item to be hashed has an equal probability of being placed into a slot, regardless of the other elements already placed. This assumption generalizes the details of the hash function and allows for certain assumptions about the stochastic system.

Applications

SUHA is most commonly used as a foundation for mathematical proofs describing the properties and behavior of hash tables in theoretical computer science. Minimizing hashing collisions can be achieved with a uniform hashing function. These functions often rely on the specific input data set and can be quite difficult to implement. Assuming uniform hashing allows hash table analysis to be made without exact knowledge of the input or the hash function used.

Mathematical implications

Certain properties of hash tables can be derived once uniform hashing is assumed.

Uniform distribution

Under the assumption of uniform hashing, given a hash function h, and a hash table of size m, the probability that two non-equal elements will hash to the same slot is

P (h (a) = h (b)) = \frac{1}{m} .

Collision chain length

Under the assumption of uniform hashing, the load factor $α$ and the average chain length of a hash table of size m with n elements will be

α = \frac{n}{m}

Successful lookup

Under the assumption of uniform hashing, the average time (in big-O notation) to successfully find an element in a hash table using chaining is

O (α + 1)

Unsuccessful lookup

Under the assumption of uniform hashing, the average time (in big-O notation) to unsuccessfully find an element in a hash table using chaining is

O (α + 1)

Example

A simple example of using SUHA can be seen while observing an arbitrary hash table of size 10 and a data set of 30 unique elements. If chaining is used to deal with collisions, the average chain length of this hash table may be a desirable value. Without any assumptions and with no more additional information about the data or hash function, the chain length cannot be estimated. With SUHA however, we can state that because of an assumed uniform hashing, each element has an equal probability of mapping to a slot. Since no particular slot should be favored over another, the 30 elements should hash into the 10 slots uniformly. This will produce a hash table with, on average, 10 chains each of length 3

α = \frac{n}{m}

α = \frac{30}{10}

α = 3

References

General

Collins, William (2004). "Section 14.3.2: The Uniform Hashing Assumption". Data Structures and the Java Collections Framework. McGraw-Hill. pp. 608. ISBN 0-07-282379-8.
Cormen, Thomas H. (2001). "Section 11.2: Hash Tables". MIT Press and McGraw-Hill. pp. 226–228. ISBN 0-262-03293-7.

0.00

(0 votes)

Original source: https://en.wikipedia.org/wiki/SUHA (computer science). Read more

Anonymous

Search

SUHA (computer science)

Namespaces

More

Page actions

Contents

Applications

Mathematical implications

Uniform distribution

Collision chain length

Successful lookup

Unsuccessful lookup

Example

See also

References

General

Navigation

Navigation

Resources

Help

googletranslator

Navigation

Wiki tools

Wiki tools

Anonymous

Search

SUHA (computer science)

Applications

Mathematical implications

Uniform distribution

Collision chain length

Successful lookup

Unsuccessful lookup

Example

See also

References

General

Navigation

Wiki tools

Page tools

Other projects

Categories