Irwin–Hall distribution

[Figure: probability density function of the distribution]
[Figure: cumulative distribution function of the distribution]
Parameters [math]\displaystyle{ n \in \mathbb{N}_0 }[/math]
Support [math]\displaystyle{ x \in [0,n] }[/math]
PDF [math]\displaystyle{ \frac{1}{(n-1)!}\sum_{k=0}^{\lfloor x\rfloor}(-1)^k\binom{n}{k}(x-k)^{n-1} }[/math]
CDF [math]\displaystyle{ \frac{1}{n!}\sum_{k=0}^{\lfloor x\rfloor}(-1)^k\binom{n}{k}(x-k)^n }[/math]
Mean [math]\displaystyle{ \frac{n}{2} }[/math]
Median [math]\displaystyle{ \frac{n}{2} }[/math]
Mode [math]\displaystyle{ \begin{cases} \text{any value in } [0,1] & \text{for } n=1 \\ \frac{n}{2} & \text{otherwise} \end{cases} }[/math]
Variance [math]\displaystyle{ \frac{n}{12} }[/math]
Skewness 0
Excess kurtosis [math]\displaystyle{ -\tfrac{6}{5n} }[/math]
MGF [math]\displaystyle{ {\left(\frac{\mathrm{e}^t-1} t\right)}^n }[/math]
CF [math]\displaystyle{ {\left(\frac{\mathrm{e}^{it}-1}{it}\right)}^n }[/math]

In probability and statistics, the Irwin–Hall distribution, named after Joseph Oscar Irwin and Philip Hall, is a probability distribution for a random variable defined as the sum of a number of independent random variables, each having a uniform distribution.[1] For this reason it is also known as the uniform sum distribution.

Pseudo-random numbers with an approximately normal distribution are sometimes generated by summing a number of pseudo-random numbers having a uniform distribution, usually for the sake of programming simplicity. Rescaling the Irwin–Hall distribution gives the exact distribution of the random variates generated in this way.

This distribution is sometimes confused with the Bates distribution, which is the mean (not sum) of n independent random variables uniformly distributed from 0 to 1.

Definition

The Irwin–Hall distribution is the continuous probability distribution for the sum of n independent and identically distributed U(0, 1) random variables:

[math]\displaystyle{ X = \sum_{k=1}^n U_k. }[/math]

The probability density function (pdf) for [math]\displaystyle{ 0\leq x\leq n }[/math] is given by

[math]\displaystyle{ f_X(x;n)=\frac{1}{(n-1)!}\sum_{k=0}^n (-1)^k{n \choose k} (x-k)_+^{n-1} }[/math]

where [math]\displaystyle{ (x-k)_+ }[/math] denotes the positive part of the expression:

[math]\displaystyle{ (x-k)_+ = \begin{cases} x-k & x-k \geq 0 \\ 0 & x-k \lt 0.\end{cases} }[/math]

Thus the pdf is a spline (piecewise polynomial function) of degree n − 1 over the knots 0, 1, ..., n. In fact, for x between the knots located at k and k + 1, the pdf is equal to

[math]\displaystyle{ f_X(x;n) = \frac{1}{(n-1)!}\sum_{j=0}^{n-1} a_j(k,n) x^j }[/math]

where the coefficients [math]\displaystyle{ a_j(k,n) }[/math] may be found from a recurrence relation over k:

[math]\displaystyle{ a_j(k,n)=\begin{cases} 1&k=0, j=n-1\\ 0&k=0, j\lt n-1\\ a_j(k-1,n) + (-1)^{n+k-j-1}{n\choose k}{{n-1}\choose j}k^{n-j-1} &k\gt 0\end{cases} }[/math]

These coefficients are listed as sequence A188816 in the OEIS. The coefficients for the cumulative distribution function are sequence A188668.

The mean and variance are n/2 and n/12, respectively.
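
As an illustrative sketch (not taken from the cited sources), the density formula above and the CDF formula from the summary table can be evaluated directly in Python. The function names irwin_hall_pdf and irwin_hall_cdf are hypothetical. The alternating sum is assumed to be used only for moderate n, since it can lose floating-point precision when n is large.

```python
from math import comb, factorial, floor

def irwin_hall_pdf(x: float, n: int) -> float:
    """Density of the sum of n independent U(0, 1) variables (n >= 1)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    if x < 0 or x > n:
        return 0.0
    # Terms with k > x vanish because (x - k)_+ = 0, so the sum stops at floor(x).
    # For n = 1 the value at the knot x = 1 depends on the 0**0 convention;
    # that single point has probability zero.
    total = sum((-1) ** k * comb(n, k) * (x - k) ** (n - 1)
                for k in range(floor(x) + 1))
    return total / factorial(n - 1)

def irwin_hall_cdf(x: float, n: int) -> float:
    """Cumulative distribution function of the same sum."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    if x <= 0:
        return 0.0
    if x >= n:
        return 1.0
    total = sum((-1) ** k * comb(n, k) * (x - k) ** n
                for k in range(floor(x) + 1))
    return total / factorial(n)
```

For example, irwin_hall_pdf(1.5, 3) evaluates the middle piece of the n = 3 density, and irwin_hall_cdf(1.5, 3) returns one half, consistent with the median n/2.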

Special cases

  • For n = 1,
[math]\displaystyle{ f_X(x)= \begin{cases} 1 & 0\le x \le 1 \\ 0 & \text{otherwise} \end{cases} }[/math]
  • For n = 2,
[math]\displaystyle{ f_X(x)= \begin{cases} x & 0\le x \le 1\\ 2-x & 1\le x \le 2 \end{cases} }[/math]
  • For n = 3,
[math]\displaystyle{ f_X(x)= \begin{cases} \frac{1}{2}x^2 & 0\le x \le 1\\ \frac{1}{2}(-2x^2 + 6x - 3)& 1\le x \le 2\\ \frac{1}{2}(3 - x)^2 & 2\le x \le 3 \end{cases} }[/math]
  • For n = 4,
[math]\displaystyle{ f_X(x)= \begin{cases} \frac{1}{6}x^3 & 0\le x \le 1\\ \frac{1}{6}(-3x^3 + 12x^2 - 12x+4)& 1\le x \le 2\\ \frac{1}{6}(3x^3 - 24x^2 +60x-44) & 2\le x \le 3\\ \frac{1}{6}(4 - x)^3 & 3\le x \le 4 \end{cases} }[/math]
  • For n = 5,
[math]\displaystyle{ f_X(x)= \begin{cases} \frac{1}{24}x^4 & 0\le x \le 1\\ \frac{1}{24}(-4x^4 + 20x^3 - 30x^2+20x-5)& 1\le x \le 2\\ \frac{1}{24}(6x^4-60x^3+210x^2-300x+155) & 2\le x \le 3\\ \frac{1}{24}(-4x^4+60x^3-330x^2+780x-655) & 3\le x \le 4\\ \frac{1}{24}(5 - x)^4 &4\le x\le5 \end{cases} }[/math]
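
The piecewise polynomials listed above can be reproduced from the recurrence for the coefficients a_j(k, n) given in the Definition section. The following is a minimal sketch under that assumption. The names spline_coefficients and pdf_from_coefficients are hypothetical, and row k of the result holds the coefficients of x^0, ..., x^(n−1) for the piece between the knots k and k + 1, before division by (n − 1)!.

```python
from math import comb, factorial

def spline_coefficients(n: int) -> list[list[int]]:
    """Rows a[k][j] for k = 0..n-1 and j = 0..n-1, built by the recurrence over k."""
    row = [0] * n
    row[n - 1] = 1  # k = 0: only the x^(n-1) term is present
    rows = [row]
    for k in range(1, n):
        prev = rows[-1]
        row = [
            prev[j]
            + (-1) ** (n + k - j - 1) * comb(n, k) * comb(n - 1, j) * k ** (n - j - 1)
            for j in range(n)
        ]
        rows.append(row)
    return rows

def pdf_from_coefficients(x: float, n: int) -> float:
    """Evaluate the density by selecting the polynomial piece that contains x."""
    if x < 0 or x > n:
        return 0.0
    k = min(int(x), n - 1)  # x = n belongs to the last piece
    row = spline_coefficients(n)[k]
    return sum(a * x ** j for j, a in enumerate(row)) / factorial(n - 1)
```

For n = 3 the rows are [0, 0, 1], [−3, 6, −2] and [9, −6, 1], which correspond to x², −2x² + 6x − 3 and (3 − x)² in the list above.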

Approximating a Normal distribution

By the Central Limit Theorem, as n increases, the Irwin–Hall distribution approximates ever more closely a Normal distribution with mean [math]\displaystyle{ \mu=n/2 }[/math] and variance [math]\displaystyle{ \sigma^2=n/12 }[/math]. To approximate the standard Normal distribution [math]\displaystyle{ \phi(x)=\mathcal{N}(\mu=0, \sigma^2=1) }[/math], the Irwin–Hall distribution can be centered by subtracting its mean of n/2 and then rescaled by dividing by its standard deviation, the square root of its variance n/12:

[math]\displaystyle{ \phi(x) \overset{n\gg 0}{\approx} \sqrt{\frac{n}{12}} f_X\left(x\sqrt{\frac{n}{12}}+\frac{n}{2};n \right ) }[/math]
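
The quality of this approximation can be inspected numerically. The sketch below is an illustration only: it evaluates the rescaled density on the right-hand side and compares it with the standard Normal density. The helper names (irwin_hall_pdf, standardized_pdf, normal_pdf) are hypothetical and not from the cited sources.

```python
from math import comb, exp, factorial, floor, pi, sqrt

def irwin_hall_pdf(x: float, n: int) -> float:
    """Density of the sum of n independent U(0, 1) variables."""
    if x < 0 or x > n:
        return 0.0
    total = sum((-1) ** k * comb(n, k) * (x - k) ** (n - 1)
                for k in range(floor(x) + 1))
    return total / factorial(n - 1)

def standardized_pdf(x: float, n: int) -> float:
    """Density of the centered and rescaled sum, i.e. sqrt(n/12) * f_X(x*sqrt(n/12) + n/2; n)."""
    sigma = sqrt(n / 12)
    return sigma * irwin_hall_pdf(x * sigma + n / 2, n)

def normal_pdf(x: float) -> float:
    """Standard Normal density."""
    return exp(-x * x / 2) / sqrt(2 * pi)

# Print the absolute error at a few points for increasing n.
for n in (2, 4, 12):
    errors = [abs(standardized_pdf(x, n) - normal_pdf(x)) for x in (0.0, 1.0, 2.0)]
    print(n, [f"{e:.4f}" for e in errors])
```

Because the rescaled variable is confined to a bounded interval, the approximation cannot reproduce the tails of the Normal distribution, whatever the value of n.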

This leads to a computationally simple heuristic: choosing n = 12 makes the variance n/12 equal to 1, so no square-root scaling is needed, and a standard Normal distribution can be approximated by the sum of 12 uniform U(0,1) draws minus 6:

[math]\displaystyle{ \sum_{k=1}^{12}U_k -6 \sim f_X(x+6;12) \mathrel{\dot\sim} \phi(x) }[/math]
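
A minimal sketch of this heuristic in Python, using the standard library generator, is given below. The function name approx_standard_normal is hypothetical, and the fixed seed is chosen only to make the run reproducible.

```python
import random

def approx_standard_normal(rng: random.Random) -> float:
    """One draw of (U_1 + ... + U_12) - 6, which has mean 0 and variance 1."""
    return sum(rng.random() for _ in range(12)) - 6.0

rng = random.Random(0)  # fixed seed, for reproducibility only
draws = [approx_standard_normal(rng) for _ in range(20_000)]
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
print(f"sample mean = {mean:.3f}, sample variance = {var:.3f}")
```

Every draw lies in the interval [−6, 6], so values further than six standard deviations from the mean are never produced.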

Similar and related distributions

The Irwin–Hall distribution is similar to the Bates distribution, but its parameter is still restricted to integers. An extension to real-valued parameters is possible by also adding a uniform random variable of width N − trunc(N).

Extensions to the Irwin–Hall distribution

When the Irwin–Hall distribution is used for data fitting, one problem is its limited flexibility, because the parameter n must be an integer. However, instead of summing n identically distributed uniform variables, one could also add, for example, U + 0.5U to cover the case n = 1.5 (giving a trapezoidal distribution).
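
The extension can be sketched as follows, under the assumption that the extra term is a uniform variable of width N − trunc(N) drawn independently of the others. The names sample_extended and trapezoid_pdf are hypothetical. The trapezoidal density for N = 1.5 is obtained here by convolving the U(0, 1) and U(0, 0.5) densities and is not taken from the cited sources.

```python
import random
from math import floor

def sample_extended(N: float, rng: random.Random) -> float:
    """Sum of floor(N) U(0, 1) draws plus one uniform draw of width N - floor(N)."""
    whole = floor(N)
    frac = N - whole
    return sum(rng.random() for _ in range(whole)) + frac * rng.random()

def trapezoid_pdf(x: float) -> float:
    """Density of U(0, 1) + U(0, 0.5) for independent terms (the case N = 1.5)."""
    if x < 0 or x > 1.5:
        return 0.0
    if x < 0.5:
        return 2.0 * x        # rising edge
    if x <= 1.0:
        return 1.0            # flat top
    return 3.0 - 2.0 * x      # falling edge

rng = random.Random(0)  # fixed seed, for reproducibility only
samples = [sample_extended(1.5, rng) for _ in range(20_000)]
print(f"sample mean = {sum(samples) / len(samples):.3f}")
```

Under this construction the mean remains N/2, while the variance is (trunc(N) + (N − trunc(N))²)/12 rather than N/12.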

The Irwin–Hall distribution has been applied to beamforming and pattern synthesis; see Figure 1 of reference [2][3].

Notes

  1. Johnson, N.L.; Kotz, S.; Balakrishnan, N. (1995) Continuous Univariate Distributions, Volume 2, 2nd Edition, Wiley. ISBN 0-471-58494-0 (Section 26.9)
  2. "Sidelobe behavior and bandwidth characteristics of distributed antenna arrays". January 2018. pp. 1–2. https://ieeexplore.ieee.org/document/8299700/similar#similar. 
  3. https://www.usnc-ursi-archive.org/nrsm/2018/papers/B15-9.pdf

References

  • Hall, Philip. (1927) "The Distribution of Means for Samples of Size N Drawn from a Population in which the Variate Takes Values Between 0 and 1, All Such Values Being Equally Probable". Biometrika, Vol. 19, No. 3/4., pp. 240–245. doi:10.1093/biomet/19.3-4.240 JSTOR 2331961
  • Irwin, J.O. (1927) "On the Frequency Distribution of the Means of Samples from a Population Having any Law of Frequency with Finite Moments, with Special Reference to Pearson's Type II". Biometrika, Vol. 19, No. 3/4., pp. 225–239. doi:10.1093/biomet/19.3-4.225 JSTOR 2331960