DScience:Discrete probability distributions and their characteristics
What is a discrete probability distribution?
A discrete probability distribution is the probability distribution of a discrete random variable X. Each possible value of X is associated with a non-zero probability, and these probabilities sum to one. A discrete probability distribution is often presented as a table.
Examples:
- The number of people going to a given shop per day.
- The number of students that come to class on a given day.
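As a minimal sketch of the tabular presentation, the plain-Python snippet below stores a hypothetical distribution for the number of students in a class as a table of values and probabilities, checks that the probabilities sum to one, and computes the mean and variance. The probabilities are invented for illustration, and the snippet does not use any of the libraries discussed on this page.

```python
import math

# Hypothetical probability table: number of students in class -> probability.
# The values are illustrative assumptions, not measured data.
pmf = {18: 0.1, 19: 0.2, 20: 0.4, 21: 0.2, 22: 0.1}

# The probabilities of a discrete distribution must sum to 1.
total = sum(pmf.values())
assert math.isclose(total, 1.0)

# Expectation value E[X] = sum of x * P(x).
mean = sum(x * p for x, p in pmf.items())

# Variance Var[X] = sum of (x - E[X])^2 * P(x).
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

print("mean =", mean)
print("variance =", variance)
```

A dictionary is used here because it mirrors the table form directly: each key is a possible value of X and each entry is its probability.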
Implementations
A number of Java libraries are available to create random discrete distributions:
- jhplot.math.num.pdf
- javanpst.distributions.common.discrete
- org.apache.commons.math3.distribution.IntegerDistribution
- smile.stat.distribution.Distribution
The following code shows how to fill histograms using a number of popular distributions, including discrete ones:
Description of discrete distributions
DataMelt can be used to determine the statistical characteristics of an arbitrary frequency distribution, with moments calculated up to the 6th order. Read more about moments in Moment (mathematics).
<pycode>
from jhplot import *
from jhplot.math.StatisticSample import *

a=randomLogNormal(1000,0,10) # generate 1000 random numbers from a log-normal distribution with parameters 0 and 10
p0=P0D(a)                    # convert the sample to a P0D array
print p0.getStatString()     # print detailed characteristics
</pycode>
Run this script to get detailed, largely self-explanatory information about this distribution.
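To make the notion of moments concrete, here is a minimal plain-Python sketch that computes central moments of a sample from the 2nd to the 6th order. The sample values are hypothetical, and the normalization by the sample size is an assumption; the quantities reported by DataMelt may follow a different convention.

```python
def central_moment(sample, order):
    """Return the central moment of the given order for a list of numbers."""
    n = len(sample)
    mean = sum(sample) / float(n)
    return sum((x - mean) ** order for x in sample) / n

# Hypothetical small sample, used only for illustration.
sample = [1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 4.0, 4.0, 5.0]

# Central moments from the 2nd to the 6th order.
moments = {k: central_moment(sample, k) for k in range(2, 7)}
for order, value in sorted(moments.items()):
    print("central moment of order", order, "=", value)
```

Because this sample is symmetric around its mean, the odd-order central moments vanish, which is a quick sanity check for the function.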
Let us continue with this example and return all statistical characteristics of the sample as a dictionary. We can do this by appending lines that 1) create a dictionary "stat" with key/value pairs and 2) retrieve the variance of the sample using the key "Variance".
The appended lines print "Variance= 757.3". If you are not sure about the names of the keys, simply print the whole dictionary with "print stat".
One can create histograms that capture the most basic characteristics of the data. This is especially useful when there is no particular reason to work with the complete data array. This is easy to do for the Fibonacci sequence above.
The code converts the array into a histogram with 10 equidistant bins in the range 0-100, and then it prints the map with statistical characteristics.
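The binning step can be sketched in plain Python as follows. This is an illustration of equidistant binning only, not the DataMelt histogram class; the function name `fill_histogram` and the input values are hypothetical, and the choice to ignore values outside the range is an assumption of this sketch.

```python
def fill_histogram(values, nbins, xmin, xmax):
    """Count values in nbins equal-width bins covering [xmin, xmax)."""
    counts = [0] * nbins
    width = (xmax - xmin) / float(nbins)
    for v in values:
        if xmin <= v < xmax:  # values outside the range are ignored
            index = int((v - xmin) / width)
            # Guard against floating-point rounding at the upper edge.
            counts[min(index, nbins - 1)] += 1
    return counts

# Hypothetical input: the first Fibonacci numbers, for illustration only.
values = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144]

# 10 equidistant bins in the range 0-100; the value 144 falls outside.
counts = fill_histogram(values, 10, 0.0, 100.0)
print(counts)
```

Once the data are binned, only the ten counts need to be kept, which is the point of using a histogram when the complete array is not required.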
You can also visualize random numbers in the form of a histogram. In this case we create random numbers, convert them to histograms and plot them.
Statistics with 2D arrays
You can get detailed statistics on data described by the jhplot.P1D class using the method getStat(axis), where axis=0 for X and axis=1 for Y.
It returns a map (in Java) or a Python dictionary (in Jython) in which each statistical characteristic can be accessed using a key, such as the mean, RMS, variance, error on the mean, etc. Assuming that the P1D is represented by the object "p1", the statistics for one axis can be retrieved and printed key by key.
This will print the following values:
error 0.996592835069
rms 5.05682000584
mean 4.42857142857
variance 6.95238095238
stddev 2.63673679998
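The printed values are numerically consistent with the definitions used in the plain-Python sketch below: a sample variance with n-1 normalization, an RMS equal to the square root of the mean of squares, and an error on the mean equal to stddev/sqrt(n). This is an inference from the numbers, not a description of the P1D internals. The function name `describe` and the seven input values are hypothetical; the values were chosen only because they reproduce the numbers above, and the original data are not shown on this page.

```python
import math

def describe(values):
    """Return basic statistics of a list of numbers as a dictionary."""
    n = len(values)
    mean = sum(values) / float(n)
    # Sample variance with the n-1 (Bessel) normalization.
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    stddev = math.sqrt(variance)
    return {
        "mean": mean,
        "variance": variance,
        "stddev": stddev,
        "rms": math.sqrt(sum(v * v for v in values) / float(n)),  # root mean square
        "error": stddev / math.sqrt(n),  # standard error on the mean
    }

# Hypothetical values chosen so that the results match the numbers printed above.
stat = describe([1, 2, 3, 4, 6, 7, 8])
for key in sorted(stat):
    print(key, stat[key])
```

Accessing a single characteristic then works as with any dictionary, for example `stat["variance"]`.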
Here is a more detailed example: