DScience:Random numbers

From HandWiki


Random numbers

A random number is a number chosen by chance from a specified function or empirical distribution. Random numbers are important in many scientific areas, especially for numerical computations and simulations of complex systems using Monte Carlo methods. They are used for estimating integrals, generating encryption keys, interpreting data, and simulating and modeling complex phenomena. We will use random numbers to simulate "fake" data sets for illustrating numerical and statistical techniques.

Ideally, no individual number can be predicted from knowledge of any other number or group of numbers. However, a random number generator in a computer program is a deterministic algorithm, so its output sequence eventually repeats itself once a sufficiently large number of values has been generated. Thus, it is only a good approximation to say that the numbers are random, and the term “pseudo-random” is more appropriate. Another notion usually associated with a sequence of random numbers is the so-called “seed” value. This is a number that controls whether the random number generator produces a new set of random numbers each time the code is executed or repeats a certain sequence of random numbers.
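The repetition of a pseudo-random sequence can be illustrated with a deliberately tiny generator. The sketch below is a toy linear congruential generator; the constants a=5, c=3 and m=16 are chosen only for illustration and are far too small for real use. It is not the algorithm used by the Python or Java libraries discussed here.

```python
def lcg(seed, a=5, c=3, m=16):
    """Toy linear congruential generator: x -> (a*x + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

gen = lcg(seed=0)
sequence = [next(gen) for _ in range(20)]

# The first 16 values are all different; after that the sequence starts over.
print(sequence[:16])
print(sequence[16:])
```

With these constants the generator visits all 16 possible values once and then repeats the same cycle. Production generators behave the same way in principle, only with a vastly longer cycle.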

When debugging a program, it is often necessary to generate exactly the same random number sequence every time the program starts. In this case, one should initialize the random number generator using the same seed number. In many code examples of this book, we use the Java class Random(L) to create a sequence of random numbers, where L is a long integer. This means we initialize the generator from the same seed, i.e. our sequence is reproducible every time we run the code.

The seed must be changed for each run if you want to produce completely different sets of random numbers every time the program is executed. Usually, this can be done by generating a new seed using the current date and time, converted to an integer value. This is also the case when calling the class Random() without arguments.

Using random numbers from Python

The standard Python module implementing a random number generator is called “random”. It must be imported using the usual import statement. Let us give a simple example that shows how to generate a random floating point number in the range [0,1) using the Python random module.

Since we do not specify any argument for the Random() statement, a random seed from the current system time is used. In this case, every time you execute this script, a new random number will be generated.
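A minimal sketch of such a script, assuming the class Random from the standard module random, is given here. The variable names are illustrative only.

```python
from random import Random

r = Random()            # no argument: the seed is not fixed by the user
x = r.random()          # floating point number in the range [0.0, 1.0)
n = r.randint(1, 10)    # integer in the range [1, 10], both ends included
print(x, n)
```

Running this script several times should print different values on each run.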

In order to generate a random number predictably for debugging purposes, one should pass an integer (or long) value to the Random() constructor. For the above code, this may look as: r=Random(100L). The L suffix is Python 2 and Jython syntax; in Python 3 one writes Random(100). Now the behavior of the script will be different: every time you execute it, the method randint(1,10) will return the same value, since the seed is fixed.
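The effect of a fixed seed can be checked with a short sketch. Two independent generators are created from the same seed (100, written without the Python 2 "L" suffix so that the code also runs under Python 3) and their outputs are compared.

```python
from random import Random

r1 = Random(100)   # fixed seed
r2 = Random(100)   # same seed, separate generator

first = [r1.randint(1, 10) for _ in range(5)]
second = [r2.randint(1, 10) for _ in range(5)]

print(first == second)   # True: identical seeds give identical sequences
```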

Random numbers in Python can be generated using various distributions depending on the applied method:

>>> r.betavariate(a,b)    # Beta distribution (a>0,b>0)
>>> r.expovariate(lambd)  # Exponential distribution ("lambda" is a reserved word)
>>> r.gammavariate(a, b)  # Gamma distribution.
>>> r.gauss(m,s)          # Gaussian distribution
>>> r.lognormvariate(m,s) # Log normal distribution
>>> r.normalvariate(m,s)  # Normal distribution
>>> r.randint(min,max)    # int in range [min,max]
>>> r.random()            # in range [0.0, 1.0)
>>> r.uniform(min,max)    # real number in [min,max]

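In the examples above, “m” denotes a mean value and “s” represents a standard deviation for the output distributions. The method lognormvariate(m,s) is an exception: there “m” and “s” are the mean and the standard deviation of the logarithm of the returned value.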
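The meaning of “m” and “s” can be checked numerically. The sketch below draws a large sample from gauss() and from lognormvariate() with a fixed seed and compares the sample statistics with the input parameters. The seed, the sample size and the parameter values are arbitrary choices.

```python
import math
from random import Random
from statistics import mean, stdev

r = Random(1)        # fixed seed, so the check is reproducible
m, s = 5.0, 2.0      # arbitrary mean and standard deviation

sample = [r.gauss(m, s) for _ in range(10000)]
print(mean(sample), stdev(sample))   # close to m and s

# For lognormvariate, m and s describe the logarithm of the output
logs = [math.log(r.lognormvariate(m, s)) for _ in range(10000)]
print(mean(logs), stdev(logs))       # again close to m and s
```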

One can reseed the random number generator with the method seed(i), and obtain and restore its internal state with the methods getstate() and setstate(state).

Note that if “i” in the method seed(i) is omitted, the generator is initialized from the current system time or, where available, from a randomness source provided by the operating system.
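Reseeding and saving the generator state can be sketched as follows, using the methods seed(), getstate() and setstate() of the standard class Random. The seed value 42 is arbitrary.

```python
from random import Random

r = Random()
r.seed(42)                 # reseed with a fixed integer

state = r.getstate()       # remember the internal state
a = [r.random() for _ in range(3)]

r.setstate(state)          # go back to the remembered state
b = [r.random() for _ in range(3)]

print(a == b)              # True: the same three numbers are produced again

r.seed()                   # no argument: reseed from a non-fixed source
```

Saving and restoring the state is useful when only a part of a longer calculation has to be repeated with the same random numbers.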

Random numbers are also used for manipulations with Python/Jython lists. One can randomly rearrange the elements of a list in place with the method shuffle(list). Printing the list afterwards gives, for example:

[2, 4, 2, 7, 6, 5, 9, 8, 1] # random list 

One can pick a random value from a list with the method choice(list). That is all it takes. Similarly, one can get a random sample of k elements, chosen without replacement, with the method sample(list, k).

Every time you execute such code, the printed numbers will be different.
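The list manipulations described above can be sketched with the methods shuffle(), choice() and sample() of the standard class Random. The list contents are hypothetical and differ from the printed example.

```python
from random import Random

r = Random()
items = [1, 2, 3, 4, 5, 6, 7, 8, 9]   # hypothetical input list

r.shuffle(items)              # rearranges the list in place
print(items)

picked = r.choice(items)      # one random element
print(picked)

subset = r.sample(items, 3)   # three elements, chosen without replacement
print(subset)
```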

Using random numbers from Java

Jython gives you a much larger choice for random numbers since you can use the native Java and other 3rd party libraries. It is advisable to use the standard Java libraries to create arrays with random numbers where possible, instead of filling lists with random values using Python loops. There are several reasons for this: 1) there are fewer chances to make a mistake; 2) programs based on the standard Java libraries can be faster; 3) code with calls to Java libraries can be reused by standard Java programs, or by programs based on alternative scripting languages supported by Java.

First we will discuss the most common classes to generate random numbers in Java. Random numbers provided by the Java API have already been used in the previous sections. Recall that the class Random can be used to generate a single random number. Below we check the methods of this class:

The last command prints

[.. "nextDouble", "nextFloat", "nextGaussian", "nextInt", "nextLong" ..]

In the first definition, the default seed comes from the computer system time. In the second example, we initiate the random sequence from an input value to obtain reproducible results for every program execution. Here we describe the most common methods to generate random numbers. As usual, “i” denotes an integer value, “l” represents a long integer value, “d” means a double value, while “b” corresponds to a boolean value.

i=r.nextInt(n)
random int [math]\displaystyle{ \geq }[/math] 0 and [math]\displaystyle{ < }[/math] n
i=r.nextInt()
random int (full range)
l=r.nextLong()
random long (full range)
d=r.nextDouble()
random double [math]\displaystyle{ \geq }[/math] 0.0 and [math]\displaystyle{ < }[/math] 1.0
b=r.nextBoolean()
random boolean, true (1) or false (0)
d=r.nextGaussian()
random double from a Gaussian distribution with the mean 0.0 and the standard deviation 1.0

To build a list containing random numbers, invoke a Python loop. For example, this code typed using the Jython Shell creates a list with Gaussian random numbers using the Java class java.util.Random:
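A plain Python stand-in for such a loop is sketched here. It uses the method gauss(0.0, 1.0) of the standard module random instead of the Java method nextGaussian(), so that it runs without a Java virtual machine. The list length of 100 is an arbitrary choice.

```python
from random import Random

r = Random()
values = []
for i in range(100):
    # mean 0.0 and standard deviation 1.0, as for nextGaussian() in Java
    values.append(r.gauss(0.0, 1.0))

print(len(values))   # 100
```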

Python loops are not particularly fast; therefore, it is recommended to use the predefined DMelt methods to build lists. In addition, using the predefined methods to build collections with random numbers helps to keep the code short and free of errors.

Note that high-level Java libraries are more efficient than long loops in Jython code. The reason is that the loop runs inside compiled Java methods instead of the Python interpreter, so the engine behind the calculation is significantly more optimized for speed.

Random numbers in Colt

The Colt package provides a comprehensive list of methods to create random numbers. The Java classes necessary to build random numbers come from the package cern.jet. As an example, let us consider the generation of reproducible random numbers using the cern.jet.random.engine.MersenneTwister class from the sub-package random.engine. The macro below creates a jhplot.P0D array with random numbers and then prints the statistical summary of a Gamma distribution:

Here we used the so-called “Mersenne-Twister” algorithm, a widely used uniform pseudo-random number generator with a very long period. We did not specify any argument for the engine; therefore, the seed is set to a constant value and the output is reproducible the next time you run the script. One can use the current system date for a seed to avoid reproducible results:
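The same idea can be sketched with the Python standard library, whose random module in CPython is also based on the Mersenne Twister. This is a stand-in and not the Colt API: the classes MersenneTwister and P0D are replaced by Random and a plain list, and the shape and scale values are hypothetical.

```python
import time
from random import Random
from statistics import mean, stdev

alpha, beta = 2.0, 1.5   # hypothetical shape and scale of the Gamma distribution

# Fixed seed: the numbers below are the same on every run
r = Random(12345)
data = [r.gammavariate(alpha, beta) for _ in range(10000)]

print("entries =", len(data))
print("mean    =", mean(data))    # expected near alpha*beta
print("stdev   =", stdev(data))   # expected near sqrt(alpha)*beta

# Seed taken from the current time: the numbers differ from run to run
r_time = Random(int(time.time()))
print(r_time.gammavariate(alpha, beta))
```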

Learn about all possible methods of this package as usual:

import cern.jet.random
dir(cern.jet.random)  

The above command prints the implemented distributions:

Beta, Binomial, BreitWigner, BreitWignerMeanSquare, 
ChiSquare, Empirical, EmpiricalWalker, Exponential,
ExponentialPower, Gamma, Hyperbolic, HyperGeometric, 
Logarithmic, NegativeBinomial, Normal, Poisson, 
PoissonSlow, StudentT, Uniform, VonMises, Zeta 

All these classes operate on a user-supplied uniform random number generator.

Once you know which random distribution is necessary for your program, use the code assist or the Java API documentation to learn more.

There is one special distribution you have to be aware of. One can generate an array of random numbers that follow a predefined probability distribution function (PDF). Such a distribution is called “Empirical”. The PDF should be provided as an array of positive numbers. The function can be in the form of relative probabilities, but absolute probabilities are also accepted. If the LINEAR_INTERPOLATION constant is set, a linear interpolation within the bin is computed, resulting in a constant density within each bin. When NO_INTERPOLATION is passed, no interpolation is performed and the result is a discrete distribution. Let us see this:

See also the class EmpiricalWalker, which implements the so-called Walker’s algorithm.
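The idea behind an empirical distribution can be sketched in plain Python. The function empirical() below is a hypothetical helper, not the Colt class: it selects a bin with a probability proportional to its weight and then either returns a uniformly distributed value inside the bin (constant density within the bin) or the bin center (a discrete distribution). The weights and the range [0, 4] are made up for illustration.

```python
from random import Random

def empirical(pdf, x_min, x_max, rng, interpolate=True):
    """Draw one value from a binned PDF given as non-negative relative weights."""
    nbins = len(pdf)
    width = (x_max - x_min) / nbins
    k = rng.choices(range(nbins), weights=pdf)[0]    # bin index, weighted choice
    if interpolate:
        return x_min + (k + rng.random()) * width    # uniform inside the bin
    return x_min + (k + 0.5) * width                 # bin center only

rng = Random(7)
pdf = [1.0, 3.0, 4.0, 2.0]    # relative probabilities of four bins

smooth = [empirical(pdf, 0.0, 4.0, rng) for _ in range(10000)]
discrete = [empirical(pdf, 0.0, 4.0, rng, interpolate=False) for _ in range(10000)]

# Fraction of values in the third bin; its relative weight is 4 out of 10
frac = sum(1 for v in smooth if 2.0 <= v < 3.0) / len(smooth)
print(frac)
print(sorted(set(discrete)))   # only bin centers appear
```

The weights do not need to sum to one, which corresponds to passing relative probabilities.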

3rd party Java packages

jMathTool classes further extend the Java random number generators. This library is included in the package jhplot.math. The example below shows how to generate random numbers using the jMathTool package:

Below we will show other possible options:

uniform(mi,ma)
a random number between mi and ma.
dirac(d[], p[])
a random number from a discrete random variable, where d[] is an array with discrete values, and p[] is an array with the probability of each value.
normal(m,s)
a random number from a Gaussian (Normal) distribution with the mean (“m”) and the standard deviation (“s”).
chi2(i)
a random number from a [math]\displaystyle{ \chi^2 }[/math] random variable with “i” degrees of freedom.
logNormal(m,s)
a LogNormal random variable with the mean (“m”) and the standard deviation (“s”).
exponential(lam)
a random number from an exponential distribution (mean = 1/lam, variance = 1/lam**2).
triangular(mi,ma)
a random number from a symmetric triangular distribution
triangular(mi,med,ma)
a random number from a non-symmetric triangular distribution (“med” means the value of the random variable with the maximum density)
beta(a,b)
a random number from a Beta distribution. “a” and “b” are the first and second parameters of the Beta random variable
cauchy(med,s)
a random number from a Cauchy distribution (its mean and variance are undefined). “med” is the median of the Cauchy random variable, “s” is the second parameter of the Cauchy random variable.
weibull(lam,c)
a random number from a Weibull distribution. “lam” is the first parameter of the Weibull random variable, “c” is the second parameter of the Weibull random variable.

Finally, one can generate a random number from an analytic function using the well-known rejection method. This requires building a F1D function first and then passing its parsed object to the rejection() method. Below we show how this can be done:

The last line prints a value such as 1.4. The method rejection() takes four arguments: a parsed function, a maximum value of the function ([math]\displaystyle{ 15 }[/math] in this case), and a minimum and a maximum value for the abscissa. The method returns a random number between 1 and 2, since these numbers have been specified in the rejection() method.
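The rejection method itself can be sketched in a few lines of plain Python. The function rejection() below is an illustrative re-implementation with the same kind of arguments (function, maximum of the function, abscissa range); it is not the jMathTool code. The function f(x)=3x² is a hypothetical choice, and its maximum on [1, 2] is 12, which stays below the bound of 15.

```python
from random import Random

def rejection(f, f_max, x_min, x_max, rng):
    """Draw one value from a density proportional to f on [x_min, x_max].

    f_max must not be smaller than the maximum of f on this interval.
    """
    while True:
        x = rng.uniform(x_min, x_max)   # candidate abscissa
        y = rng.uniform(0.0, f_max)     # uniform height inside the bounding box
        if y <= f(x):                   # keep only points under the curve
            return x

def f(x):
    return 3.0 * x * x                  # hypothetical function

rng = Random(1)
values = [rejection(f, 15.0, 1.0, 2.0, rng) for _ in range(10000)]

print(values[0])
print(sum(values) / len(values))        # expected near 45/28, about 1.607
```

A bound much larger than the true maximum still gives correct results, but more candidates are rejected and the method becomes slower.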

Random numbers can also be generated using the Apache Commons Math package. A random generator can be initialized using the Java class called RandomDataGenerator(). After the initialization, call its methods to generate a random number. The code below shows how to generate a single random number from different distributions:

Check org.apache.commons.math3.random.RandomDataGenerator for more options. One can reseed the generator by calling the method reSeed() without arguments, which sets the seed from the current computer time in milliseconds. One can also reseed the random number generator with a supplied seed using the method reSeed(i), where “i” is an arbitrary integer number.

The fact that one can define characteristics of random numbers in the same method that returns the random value is quite useful. For example, one can easily convolve several random distributions and create a new random distribution. To be more specific, let us convolve a Poisson distribution with a Gamma distribution:
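A plain Python sketch of this idea is given here, under the assumption that the convolution means the distribution of the sum of two independent random values, one from a Poisson and one from a Gamma distribution. It does not use the Apache classes. The helper poisson() is a simple textbook sampler (Knuth's multiplication method) written for this example, since the standard module random has no Poisson method. All parameter values are hypothetical.

```python
import math
from random import Random
from statistics import mean

def poisson(lam, rng):
    """Poisson random integer by Knuth's multiplication method (small lam only)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

rng = Random(3)
lam = 4.0                  # hypothetical Poisson mean
alpha, beta = 2.0, 0.5     # hypothetical Gamma shape and scale

# Each entry is the sum of one Poisson and one Gamma random value
total = [poisson(lam, rng) + rng.gammavariate(alpha, beta) for _ in range(10000)]

print(mean(total))         # expected near lam + alpha*beta = 5.0
```

Filling a histogram with these sums would show the shape of the combined distribution.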

This tutorial is provided under this license agreement.