Compound Poisson distribution

Short description: Aspect of probability theory

In probability theory, a compound Poisson distribution is the probability distribution of the sum of a number of independent identically-distributed random variables, where the number of terms to be added is itself a Poisson-distributed variable. The result can be either a continuous or a discrete distribution.

Definition

Suppose that

[math]\displaystyle{ N\sim\operatorname{Poisson}(\lambda), }[/math]

i.e., N is a random variable whose distribution is a Poisson distribution with expected value λ, and that

[math]\displaystyle{ X_1, X_2, X_3, \dots }[/math]

are identically distributed random variables that are mutually independent and also independent of N. Then the probability distribution of the sum of [math]\displaystyle{ N }[/math] i.i.d. random variables

[math]\displaystyle{ Y = \sum_{n=1}^N X_n }[/math]

is a compound Poisson distribution.

When N = 0, this is a sum of zero terms, so the value of Y is 0. Hence the conditional distribution of Y given that N = 0 is a degenerate distribution.

The compound Poisson distribution is obtained by marginalising the joint distribution of (Y,N) over N, and this joint distribution can be obtained by combining the conditional distribution Y | N with the marginal distribution of N.
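The two-stage construction (draw N, then sum N copies of X) can be sketched as a small sampler. This is a minimal illustration using only the Python standard library; the function names, the seed, λ = 3 and the choice of exponential summands with rate 2 are assumptions made for the example, not part of the definition.

```python
import random

def sample_poisson(lam, rng):
    """Draw N ~ Poisson(lam) by counting rate-lam exponential arrivals in [0, 1]."""
    n = 0
    t = rng.expovariate(lam)
    while t <= 1.0:
        n += 1
        t += rng.expovariate(lam)
    return n

def sample_compound_poisson(lam, draw_x, rng):
    """Draw Y = X_1 + ... + X_N; the empty sum (N = 0) is 0."""
    n = sample_poisson(lam, rng)
    return sum(draw_x(rng) for _ in range(n))

rng = random.Random(0)  # fixed seed so the run is repeatable
# Illustrative assumption: lambda = 3 and X ~ Exponential with rate 2.
samples = [
    sample_compound_poisson(3.0, lambda r: r.expovariate(2.0), rng)
    for _ in range(20000)
]
# Fraction of draws with Y = 0; this should be near P(N = 0) = exp(-3).
frac_zero = sum(1 for y in samples if y == 0) / len(samples)
print(frac_zero, sum(samples) / len(samples))
```

Even with continuous summands, Y has an atom at zero of mass e^{-λ}, which the sampler reproduces through the empty sum.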

Properties

The expected value and the variance of the compound distribution can be derived in a simple way from the law of total expectation and the law of total variance. Thus

[math]\displaystyle{ \operatorname{E}(Y)= \operatorname{E}\left[\operatorname{E}(Y \mid N)\right]= \operatorname{E}\left[N \operatorname{E}(X)\right]= \operatorname{E}(N) \operatorname{E}(X) , }[/math]
[math]\displaystyle{ \begin{align} \operatorname{Var}(Y) & = \operatorname{E}\left[\operatorname{Var}(Y\mid N)\right] + \operatorname{Var}\left[\operatorname{E}(Y \mid N)\right] =\operatorname{E} \left[N\operatorname{Var}(X)\right] + \operatorname{Var}\left[N\operatorname{E}(X)\right] , \\[6pt] & = \operatorname{E}(N)\operatorname{Var}(X) + \left(\operatorname{E}(X) \right)^2 \operatorname{Var}(N). \end{align} }[/math]

Then, since E(N) = Var(N) if N is Poisson-distributed, these formulae can be reduced to

[math]\displaystyle{ \operatorname{E}(Y)= \operatorname{E}(N)\operatorname{E}(X) , }[/math]
[math]\displaystyle{ \operatorname{Var}(Y) = \operatorname{E}(N)(\operatorname{Var}(X) + (\operatorname{E}(X))^2)= \operatorname{E}(N){\operatorname{E}(X^2)}. }[/math]
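As a worked check of these formulas, the sketch below evaluates both the general expressions (any counting variable N) and the Poisson special case. The function names and the numerical values (λ = 3, E(X) = 0.5, Var(X) = 0.25, which correspond to an exponential X with rate 2) are illustrative assumptions.

```python
def compound_moments(mean_n, var_n, mean_x, var_x):
    """Mean and variance of Y from the laws of total expectation and variance."""
    mean_y = mean_n * mean_x
    var_y = mean_n * var_x + mean_x ** 2 * var_n
    return mean_y, var_y

def compound_poisson_moments(lam, mean_x, var_x):
    """Poisson case: E(N) = Var(N) = lam, so Var(Y) = lam * E(X^2)."""
    second_moment_x = var_x + mean_x ** 2
    return lam * mean_x, lam * second_moment_x

general = compound_moments(3.0, 3.0, 0.5, 0.25)
poisson = compound_poisson_moments(3.0, 0.5, 0.25)
print(general, poisson)  # (1.5, 1.5) (1.5, 1.5)
```

The two results coincide because the general variance formula collapses to λ E(X²) once E(N) and Var(N) are both set to λ.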

The probability distribution of Y can be determined in terms of characteristic functions:

[math]\displaystyle{ \varphi_Y(t) = \operatorname{E}(e^{itY})= \operatorname{E} \left( \left(\operatorname{E} (e^{itX}\mid N) \right)^N \right)= \operatorname{E} \left((\varphi_X(t))^N\right), \, }[/math]

and hence, using the probability-generating function of the Poisson distribution, we have

[math]\displaystyle{ \varphi_Y(t) = \textrm{e}^{\lambda(\varphi_X(t) - 1)}.\, }[/math]
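The last step can be checked numerically by summing the Poisson-weighted series E[(φ_X(t))^N] term by term and comparing it with the closed form. The following is a sketch under stated assumptions: X is taken to be exponential with rate 2 (characteristic function rate / (rate − it)), λ = 3, and the function names and truncation length are chosen for illustration.

```python
import cmath
import math

def cf_exponential(t, rate):
    """Characteristic function of an exponential distribution with the given rate."""
    return rate / (rate - 1j * t)

def cf_compound_closed_form(t, lam, cf_x):
    """exp(lam * (phi_X(t) - 1))."""
    return cmath.exp(lam * (cf_x(t) - 1))

def cf_compound_series(t, lam, cf_x, n_terms=100):
    """Truncated sum over n of P(N = n) * phi_X(t)**n with N ~ Poisson(lam)."""
    phi = cf_x(t)
    total = 0j
    term = math.exp(-lam)  # n = 0 term: P(N = 0) * phi**0
    for n in range(n_terms):
        total += term
        term *= lam * phi / (n + 1)  # turn the n-th term into the (n + 1)-th
    return total

def cf_x(u):
    return cf_exponential(u, 2.0)

closed = cf_compound_closed_form(0.7, 3.0, cf_x)
series = cf_compound_series(0.7, 3.0, cf_x)
print(abs(closed - series))  # difference at floating-point rounding level
```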

An alternative approach is via cumulant generating functions:

[math]\displaystyle{ K_Y(t)=\ln \operatorname{E}[e^{tY}]=\ln \operatorname E[\operatorname E[e^{tY}\mid N]]=\ln \operatorname E[e^{NK_X(t)}]=K_N(K_X(t)) . \, }[/math]

Via the law of total cumulance it can be shown that, if the mean of the Poisson distribution λ = 1, the cumulants of Y are the same as the moments of X1.[citation needed]
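A direct route to the same statement, sketched under the assumption that the moment generating function of X (written M_X here, a symbol not used elsewhere in this article) exists in a neighbourhood of 0, substitutes the Poisson cumulant generating function into the composition formula above and reads off the coefficients of t^n / n!:

```latex
K_N(s) = \lambda\,(e^{s} - 1)
\quad\Longrightarrow\quad
K_Y(t) = K_N\bigl(K_X(t)\bigr)
       = \lambda\bigl(M_X(t) - 1\bigr)
       = \lambda \sum_{n=1}^{\infty} \operatorname{E}(X^n)\,\frac{t^n}{n!},
\qquad\text{so}\qquad
\kappa_n(Y) = \lambda\,\operatorname{E}(X^n).
```

For n = 1 and n = 2 this reproduces E(Y) = λ E(X) and Var(Y) = λ E(X²), and setting λ = 1 gives the equality of cumulants of Y and moments of X.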

It can be shown that every infinitely divisible probability distribution is a limit of compound Poisson distributions.[1] Moreover, every compound Poisson distribution is itself infinitely divisible, which follows directly from the definition.

Discrete compound Poisson distribution

When [math]\displaystyle{ X_1, X_2, X_3, \dots }[/math] are positive integer-valued i.i.d. random variables with [math]\displaystyle{ P(X_1 = k) = \alpha_k,\ (k =1,2, \ldots ) }[/math], the compound Poisson distribution is called a discrete compound Poisson distribution[2][3][4] (or stuttering-Poisson distribution[5]). A discrete random variable [math]\displaystyle{ Y }[/math] whose probability generating function satisfies

[math]\displaystyle{ P_Y(z) = \sum\limits_{i = 0}^\infty P(Y = i)z^i = \exp\left(\sum\limits_{k = 1}^\infty \alpha_k \lambda (z^k - 1)\right), \quad (|z| \le 1) }[/math]

has a discrete compound Poisson (DCP) distribution with parameters [math]\displaystyle{ (\alpha_1 \lambda,\alpha_2 \lambda, \ldots ) \in \mathbb{R}^\infty }[/math] (where [math]\displaystyle{ \sum_{i = 1}^\infty \alpha_i = 1 }[/math], with [math]\displaystyle{ \alpha_i \ge 0,\lambda \gt 0 }[/math]), which is denoted by

[math]\displaystyle{ Y \sim {\text{DCP}}(\lambda {\alpha _1},\lambda {\alpha _2}, \ldots ) }[/math]

Moreover, if [math]\displaystyle{ X \sim {\operatorname{DCP}}(\lambda {\alpha _1}, \ldots ,\lambda {\alpha _r}) }[/math], we say [math]\displaystyle{ X }[/math] has a discrete compound Poisson distribution of order [math]\displaystyle{ r }[/math]. When [math]\displaystyle{ r = 1,2 }[/math], the DCP becomes the Poisson distribution and the Hermite distribution, respectively. When [math]\displaystyle{ r = 3,4 }[/math], the DCP becomes the triple stuttering-Poisson distribution and the quadruple stuttering-Poisson distribution, respectively.[6] Other special cases include the shift geometric distribution, the negative binomial distribution, the geometric Poisson distribution, the Neyman type A distribution, and the Luria–Delbrück distribution arising in the Luria–Delbrück experiment. For more special cases of the DCP, see the review paper[7] and references therein.
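For a DCP distribution of finite order, the probabilities P(Y = n) can be computed from the probability generating function. Differentiating the generating function and comparing coefficients of z^(n−1) gives P(Y = 0) = exp(−λ) and n P(Y = n) = Σ_k k λα_k P(Y = n − k). The sketch below implements this recursion; the function name and the example parameters are illustrative assumptions.

```python
import math

def dcp_pmf(lams, n_max):
    """Return [P(Y = 0), ..., P(Y = n_max)] for Y ~ DCP(lams[0], lams[1], ...).

    lams[k - 1] plays the role of alpha_k * lambda; the list is assumed finite
    (a DCP distribution of order len(lams)).
    """
    p = [math.exp(-sum(lams))]  # P(Y = 0) = exp(-lambda)
    for n in range(1, n_max + 1):
        upper = min(n, len(lams))
        s = sum(k * lams[k - 1] * p[n - k] for k in range(1, upper + 1))
        p.append(s / n)
    return p

poisson_like = dcp_pmf([2.0], 10)   # order 1: should match Poisson(2)
hermite = dcp_pmf([1.0, 0.5], 60)   # order 2: a Hermite distribution
print(poisson_like[:3], sum(hermite))
```

With a single parameter the recursion reduces to P(Y = n) = (λ / n) P(Y = n − 1), which is the usual Poisson recursion.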

Feller's characterization of the compound Poisson distribution states that a non-negative integer-valued random variable [math]\displaystyle{ X }[/math] is infinitely divisible if and only if its distribution is a discrete compound Poisson distribution.[8] It can be shown that the negative binomial distribution is discrete infinitely divisible, i.e., if X has a negative binomial distribution, then for any positive integer n, there exist discrete i.i.d. random variables X1, ..., Xn whose sum has the same distribution as X. The shift geometric distribution is a discrete compound Poisson distribution since it is a special case of the negative binomial distribution.

This distribution can model batch arrivals (such as in a bulk queue[5][9]). The discrete compound Poisson distribution is also widely used in actuarial science for modelling the distribution of the total claim amount.[3]

When some [math]\displaystyle{ \alpha_k }[/math] are negative, the distribution is called a discrete pseudo compound Poisson distribution.[3] Any discrete random variable [math]\displaystyle{ Y }[/math] whose probability generating function satisfies

[math]\displaystyle{ G_Y(z) = \sum\limits_{i = 0}^\infty P(Y = i)z^i = \exp\left(\sum\limits_{k = 1}^\infty \alpha_k \lambda (z^k - 1)\right), \quad (|z| \le 1) }[/math]

has a discrete pseudo compound Poisson distribution with parameters [math]\displaystyle{ (\lambda_1 ,\lambda_2, \ldots )=:(\alpha_1 \lambda,\alpha_2 \lambda, \ldots ) \in \mathbb{R}^\infty }[/math] where [math]\displaystyle{ \sum_{i = 1}^\infty {\alpha_i} = 1 }[/math] and [math]\displaystyle{ \sum_{i = 1}^\infty {\left| {{\alpha _i}} \right|} \lt \infty }[/math], with [math]\displaystyle{ {\alpha_i} \in \mathbb{R},\lambda \gt 0 }[/math].

Compound Poisson Gamma distribution

If X has a gamma distribution, of which the exponential distribution is a special case, then the conditional distribution of Y | N is again a gamma distribution (for N ≥ 1). The marginal distribution of Y can be shown to be a Tweedie distribution[10] with variance power 1 < p < 2 (the proof is by comparison of characteristic functions). To be more explicit, if

[math]\displaystyle{ N \sim\operatorname{Poisson}(\lambda) , }[/math]

and

[math]\displaystyle{ X_i \sim \operatorname{\Gamma}(\alpha, \beta) }[/math]

i.i.d., then the distribution of

[math]\displaystyle{ Y = \sum_{i=1}^N X_i }[/math]

is a reproductive exponential dispersion model [math]\displaystyle{ ED(\mu, \sigma^2) }[/math] with

[math]\displaystyle{ \begin{align} \operatorname{E}[Y] & = \lambda \frac{\alpha}{\beta} =: \mu , \\[4pt] \operatorname{Var}[Y]& = \lambda \frac{\alpha(1+\alpha)}{\beta^2}=: \sigma^2 \mu^p . \end{align} }[/math]

The mapping from the Tweedie parameters [math]\displaystyle{ \mu, \sigma^2, p }[/math] to the Poisson and gamma parameters [math]\displaystyle{ \lambda, \alpha, \beta }[/math] is the following:

[math]\displaystyle{ \begin{align} \lambda &= \frac{\mu^{2-p}}{(2-p)\sigma^2} , \\[4pt] \alpha &= \frac{2-p}{p-1} , \\[4pt] \beta &= \frac{\mu^{1-p}}{(p-1)\sigma^2} . \end{align} }[/math]
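The mapping can be checked by converting a set of Tweedie parameters to (λ, α, β) and then recomputing the mean and variance from the compound Poisson gamma formulas above. The sketch below does this round trip; the function names and the example values μ = 2, σ² = 0.5, p = 1.5 are assumptions chosen for illustration, and β is treated as a rate parameter, as in E[Y] = λα/β.

```python
import math

def tweedie_to_poisson_gamma(mu, sigma2, p):
    """Map Tweedie (mu, sigma^2, p) with 1 < p < 2 to (lambda, alpha, beta)."""
    if not 1 < p < 2:
        raise ValueError("the compound Poisson gamma case requires 1 < p < 2")
    lam = mu ** (2 - p) / ((2 - p) * sigma2)
    alpha = (2 - p) / (p - 1)
    beta = mu ** (1 - p) / ((p - 1) * sigma2)
    return lam, alpha, beta

def poisson_gamma_moments(lam, alpha, beta):
    """Mean and variance of Y for gamma summands with shape alpha and rate beta."""
    mean = lam * alpha / beta
    var = lam * alpha * (1 + alpha) / beta ** 2
    return mean, var

mu, sigma2, p = 2.0, 0.5, 1.5
lam, alpha, beta = tweedie_to_poisson_gamma(mu, sigma2, p)
mean, var = poisson_gamma_moments(lam, alpha, beta)
print(mean, var, sigma2 * mu ** p)  # mean is close to mu, var close to sigma2 * mu**p
```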

Compound Poisson processes

Main page: Compound Poisson process

A compound Poisson process with rate [math]\displaystyle{ \lambda\gt 0 }[/math] and jump size distribution G is a continuous-time stochastic process [math]\displaystyle{ \{\,Y(t) : t \geq 0 \,\} }[/math] given by

[math]\displaystyle{ Y(t) = \sum_{i=1}^{N(t)} D_i, }[/math]

where the sum is by convention equal to zero as long as N(t) = 0. Here, [math]\displaystyle{ \{\,N(t) : t \geq 0\,\} }[/math] is a Poisson process with rate [math]\displaystyle{ \lambda }[/math], and [math]\displaystyle{ \{\,D_i : i \geq 1\,\} }[/math] are independent and identically distributed random variables, with distribution function G, which are also independent of [math]\displaystyle{ \{\,N(t) : t \geq 0\,\}.\, }[/math][11]
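A sample path of such a process can be sketched by generating exponential inter-arrival times for N(t) and accumulating one jump per arrival. This is a minimal illustration with the Python standard library; the function names, the seed, the time horizon and the exponential jump distribution are assumptions made for the example.

```python
import bisect
import random

def simulate_compound_poisson_path(rate, draw_jump, horizon, rng):
    """Return (jump times, value of Y just after each jump) on [0, horizon]."""
    times, values = [], []
    t = rng.expovariate(rate)  # first arrival of the Poisson process
    y = 0.0
    while t <= horizon:
        y += draw_jump(rng)    # add the jump D_i at the i-th arrival
        times.append(t)
        values.append(y)
        t += rng.expovariate(rate)
    return times, values

def value_at(times, values, t):
    """Y(t): running total at the last jump at or before t, or 0 if N(t) = 0."""
    i = bisect.bisect_right(times, t)
    return values[i - 1] if i > 0 else 0.0

rng = random.Random(42)
# Illustrative assumption: rate 2 and jumps D_i ~ Exponential with rate 1.
times, values = simulate_compound_poisson_path(
    2.0, lambda r: r.expovariate(1.0), 10.0, rng
)
print(len(times), value_at(times, values, 5.0))
```

For fixed t, the value Y(t) produced this way has a compound Poisson distribution with Poisson mean λt.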

The discrete version of the compound Poisson process can be used in survival analysis for frailty models.[12]

Applications

A compound Poisson distribution, in which the summands have an exponential distribution, was used by Revfeim to model the distribution of the total rainfall in a day, where each day contains a Poisson-distributed number of events each of which provides an amount of rainfall which has an exponential distribution.[13] Thompson applied the same model to monthly total rainfalls.[14]

There have been applications to insurance claims[15][16] and x-ray computed tomography.[17][18][19]

References

  1. Lukacs, E. (1970). Characteristic functions. London: Griffin.
  2. Johnson, N.L., Kemp, A.W., and Kotz, S. (2005) Univariate Discrete Distributions, 3rd Edition, Wiley, ISBN:978-0-471-27246-5.
  3. Huiming, Zhang; Yunxiao Liu; Bo Li (2014). "Notes on discrete compound Poisson model with applications to risk theory". Insurance: Mathematics and Economics 59: 325–336. doi:10.1016/j.insmatheco.2014.09.012.
  4. Huiming, Zhang; Bo Li (2016). "Characterizations of discrete compound Poisson distributions". Communications in Statistics - Theory and Methods 45 (22): 6789–6802. doi:10.1080/03610926.2014.901375. 
  5. Kemp, C. D. (1967). "'Stuttering – Poisson' distributions". Journal of the Statistical and Social Enquiry of Ireland 21 (5): 151–157.
  6. Patel, Y. C. (1976). Estimation of the parameters of the triple and quadruple stuttering-Poisson distributions. Technometrics, 18(1), 67-73.
  7. Wimmer, G., Altmann, G. (1996). The multiple Poisson distribution, its characteristics and a variety of forms. Biometrical journal, 38(8), 995-1011.
  8. Feller, W. (1968). An Introduction to Probability Theory and its Applications. I (3rd ed.). New York: Wiley. 
  9. Adelson, R. M. (1966). "Compound Poisson Distributions". Journal of the Operational Research Society 17 (1): 73–75. doi:10.1057/jors.1966.8. 
  10. Jørgensen, Bent (1997). The theory of dispersion models. Chapman & Hall. ISBN 978-0412997112. 
  11. S. M. Ross (2007). Introduction to Probability Models (ninth ed.). Boston: Academic Press. ISBN 978-0-12-598062-3. 
  12. Ata, N.; Özel, G. (2013). "Survival functions for the frailty models based on the discrete compound Poisson process". Journal of Statistical Computation and Simulation 83 (11): 2105–2116. doi:10.1080/00949655.2012.679943. 
  13. Revfeim, K. J. A. (1984). "An initial model of the relationship between rainfall events and daily rainfalls". Journal of Hydrology 75 (1–4): 357–364. doi:10.1016/0022-1694(84)90059-3. Bibcode: 1984JHyd...75..357R.
  14. Thompson, C. S. (1984). "Homogeneity analysis of a rainfall series: an application of the use of a realistic rainfall model". J. Climatology 4 (6): 609–619. doi:10.1002/joc.3370040605. Bibcode: 1984IJCli...4..609T.
  15. Jørgensen, Bent; Paes De Souza, Marta C. (January 1994). "Fitting Tweedie's compound poisson model to insurance claims data". Scandinavian Actuarial Journal 1994 (1): 69–93. doi:10.1080/03461238.1994.10413930. 
  16. Smyth, Gordon K.; Jørgensen, Bent (29 August 2014). "Fitting Tweedie's Compound Poisson Model to Insurance Claims Data: Dispersion Modelling". ASTIN Bulletin 32 (1): 143–157. doi:10.2143/AST.32.1.1020. 
  17. Whiting, Bruce R. (3 May 2002). "Signal statistics in x-ray computed tomography". Medical Imaging 2002: Physics of Medical Imaging (International Society for Optics and Photonics) 4682: 53–60. doi:10.1117/12.465601. Bibcode: 2002SPIE.4682...53W.
  18. Elbakri, Idris A.; Fessler, Jeffrey A. (16 May 2003). Sonka, Milan; Fitzpatrick, J. Michael. eds. "Efficient and accurate likelihood for iterative image reconstruction in x-ray computed tomography". Medical Imaging 2003: Image Processing (SPIE) 5032: 1839–1850. doi:10.1117/12.480302. Bibcode: 2003SPIE.5032.1839E.
  19. Whiting, Bruce R.; Massoumzadeh, Parinaz; Earl, Orville A.; O'Sullivan, Joseph A.; Snyder, Donald L.; Williamson, Jeffrey F. (24 August 2006). "Properties of preprocessed sinogram data in x-ray computed tomography". Medical Physics 33 (9): 3290–3303. doi:10.1118/1.2230762. PMID 17022224. Bibcode: 2006MedPh..33.3290W.