Pseudo-random number sampling

Short description: Generating pseudo-random numbers that follow a probability distribution

Pseudo-random number sampling or non-uniform pseudo-random variate generation is the numerical practice of generating pseudo-random numbers that are distributed according to a given probability distribution.

Methods of sampling a non-uniform distribution are typically based on the availability of a pseudo-random number generator producing numbers X that are uniformly distributed. Computational algorithms are then used to manipulate a single random variate, X, or often several such variates, into a new random variate Y such that these values have the required distribution.

Historically, basic methods of pseudo-random number sampling were developed for Monte-Carlo simulations in the Manhattan project;^{[citation needed]} they were first published by John von Neumann in the early 1950s.^[1]

Finite discrete distributions

For a discrete probability distribution with a finite number n of indices at which the probability mass function f takes non-zero values, the basic sampling algorithm is straightforward. The interval [0, 1) is divided in n intervals [0, f(1)), [f(1), f(1) + f(2)), ... The width of interval i equals the probability f(i). One draws a uniformly distributed pseudo-random number X, and searches for the index i of the corresponding interval. The so determined i will have the distribution f(i).

Formalizing this idea becomes easier by using the cumulative distribution function

[math]\displaystyle{ F(i)=\sum_{j=1}^i f(j). }[/math]

It is convenient to set F(0) = 0. The n intervals are then simply [F(0), F(1)), [F(1), F(2)), ..., [F(n − 1), F(n)). The main computational task is then to determine i for which F(i − 1) ≤ X < F(i).

This can be done by different algorithms:

Linear search, computational time linear in n.
Binary search, computational time goes with log n.
Indexed search,^[2] also called the cutpoint method.^[3]
Alias method, computational time is constant, using some pre-computed tables.
There are other methods that cost constant time.^[4]

Continuous distributions

Generic methods for generating independent samples:

Rejection sampling for arbitrary density functions
Inverse transform sampling for distributions whose CDF is known
Ratio of uniforms, combining a change of variables and rejection sampling
Slice sampling
Ziggurat algorithm, for monotonically decreasing density functions as well as symmetric unimodal distributions
Convolution random number generator, not a sampling method in itself: it describes the use of arithmetics on top of one or more existing sampling methods to generate more involved distributions.

Generic methods for generating correlated samples (often necessary for unusually-shaped or high-dimensional distributions):

Markov chain Monte Carlo, the general principle
Metropolis–Hastings algorithm
Gibbs sampling
Slice sampling
Reversible-jump Markov chain Monte Carlo, when the number of dimensions is not fixed (e.g. when estimating a mixture model and simultaneously estimating the number of mixture components)
Particle filters, when the observed data is connected in a Markov chain and should be processed sequentially

For generating a normal distribution:

For generating a Poisson distribution:

See Poisson distribution

Software libraries

GNU Scientific Library has a section entitled "Random Number Distributions" with routines for sampling under more than twenty different distributions.

Footnotes

↑ Von Neumann, John (1951). "Various Techniques Used in Connection with Random Digits". Monte Carlo Methods. National Bureau of Standards Applied Mathematics Series. 12. US Government Printing Office. pp. 36–38. https://mcnp.lanl.gov/pdf_files/nbs_vonneumann.pdf. "Any one who considers arithmetical methods of producing random digits is of course, in a state of sin." Also online is a low-quality scan of the original publication.
↑ Ripley (1987)^{[page needed]}
↑ Fishman (1996)^{[page needed]}
↑ Fishman (1996)^{[page needed]}

Literature

Devroye, L. (1986) Non-Uniform Random Variate Generation. New York: Springer
Fishman, G.S. (1996) Monte Carlo. Concepts, Algorithms, and Applications. New York: Springer
Hörmann, W.; J Leydold, G Derflinger (2004,2011) Automatic Nonuniform Random Variate Generation. Berlin: Springer.
Knuth, D.E. (1997) The Art of Computer Programming, Vol. 2 Seminumerical Algorithms, Chapter 3.4.1 (3rd edition).
Ripley, B.D. (1987) Stochastic Simulation. Wiley.

0.00

(0 votes)

[1] Von Neumann, John (1951). "Various Techniques Used in Connection with Random Digits". Monte Carlo Methods. National Bureau of Standards Applied Mathematics Series. 12. US Government Printing Office. pp. 36–38. https://mcnp.lanl.gov/pdf_files/nbs_vonneumann.pdf. "Any one who considers arithmetical methods of producing random digits is of course, in a state of sin." Also online is a low-quality scan of the original publication.

[2] Ripley (1987)^{[page needed]}

[3] Fishman (1996)^{[page needed]}

[4] Fishman (1996)^{[page needed]}

[1]

[2]

[3]

[4]

Anonymous

Search

Pseudo-random number sampling

Namespaces

More

Page actions

Contents

Finite discrete distributions

Continuous distributions

Software libraries

Footnotes

Literature

Navigation

Navigation

Help

Translate

Wiki tools

Wiki tools

Anonymous

Search

Pseudo-random number sampling

Finite discrete distributions

Continuous distributions

Software libraries

Footnotes

Literature

Navigation

Wiki tools

Page tools

Other projects

Categories