Zero-truncated Poisson distribution

From HandWiki
Short description: Conditional Poisson distribution restricted to positive integers


In probability theory, the zero-truncated Poisson (ZTP) distribution is a certain discrete probability distribution whose support is the set of positive integers. This distribution is also known as the conditional Poisson distribution[1] or the positive Poisson distribution.[2] It is the conditional probability distribution of a Poisson-distributed random variable, given that the value of the random variable is not zero. Thus it is impossible for a ZTP random variable to be zero. Consider for example the random variable of the number of items in a shopper's basket at a supermarket checkout line. Presumably a shopper does not stand in line with nothing to buy (i.e., the minimum purchase is 1 item), so this phenomenon may follow a ZTP distribution.[3]

Since the ZTP is a truncated distribution with the truncation stipulated as k > 0, one can derive the probability mass function g(k;λ) from a standard Poisson distribution f(k;λ) as follows: [4]

[math]\displaystyle{ g(k;\lambda) = P(X = k \mid X \gt 0) = \frac{f(k;\lambda)}{1-f(0;\lambda)} = \frac{\lambda ^ k e^{- \lambda} }{k ! \left ( 1 - e^{- \lambda} \right )} = \frac{\lambda^k}{(e^\lambda-1)k!} }[/math]
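The probability mass function above can be evaluated directly. The following is a minimal Python sketch, not a reference implementation; the function name `ztp_pmf` and the example value λ = 2.5 are chosen here for illustration only. It uses the last form of the formula, λ^k / ((e^λ − 1) k!), computed in log space.

```python
import math

def ztp_pmf(k: int, lam: float) -> float:
    """P(X = k | X > 0) for X ~ Poisson(lam); zero outside k >= 1."""
    if k < 1:
        return 0.0
    # Work with logarithms instead of forming lam**k and k! directly,
    # which can overflow for large k.
    log_p = k * math.log(lam) - math.lgamma(k + 1) - math.log(math.expm1(lam))
    return math.exp(log_p)

lam = 2.5  # illustrative rate parameter (an assumption, not from the article)
probs = [ztp_pmf(k, lam) for k in range(1, 60)]
total = sum(probs)  # should be very close to 1, since the tail beyond k = 59 is negligible
print(total)
```

Because `math.expm1(lam)` computes e^λ − 1, this sketch overflows for very large λ (roughly λ > 700); a different normalization would be needed in that regime.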

The mean is

[math]\displaystyle{ \operatorname{E}[X]=\frac{\lambda}{1-e^{-\lambda}}=\frac{\lambda e^\lambda}{e^\lambda-1} }[/math]

and the variance is

[math]\displaystyle{ \operatorname{Var}[X]=\frac{\lambda+\lambda^2}{1-e^{-\lambda}} - \frac{\lambda^2 }{(1-e^{-\lambda})^2} = \operatorname{E}[X](1+\lambda-\operatorname{E}[X]) }[/math]
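As a sanity check, the closed-form mean and variance can be compared with values obtained by summing over the probability mass function. The sketch below is illustrative only; the function names and the truncation point `k_max` are assumptions made here.

```python
import math

def ztp_mean(lam: float) -> float:
    # lam / (1 - e^{-lam}); -expm1(-lam) equals 1 - e^{-lam}
    return lam / -math.expm1(-lam)

def ztp_var(lam: float) -> float:
    m = ztp_mean(lam)
    return m * (1.0 + lam - m)

def ztp_moments_by_summation(lam: float, k_max: int = 200):
    """Mean and variance computed from the pmf, truncating the sums at k_max."""
    term = lam / math.expm1(lam)  # pmf at k = 1
    mean = 0.0
    second_moment = 0.0
    for k in range(1, k_max + 1):
        mean += k * term
        second_moment += k * k * term
        term *= lam / (k + 1)  # pmf(k + 1) = pmf(k) * lam / (k + 1)
    return mean, second_moment - mean ** 2

m_sum, v_sum = ztp_moments_by_summation(3.0)
print(m_sum, ztp_mean(3.0))
print(v_sum, ztp_var(3.0))
```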

Parameter estimation

The method of moments estimator [math]\displaystyle{ \widehat{\lambda} }[/math] for the parameter [math]\displaystyle{ \lambda }[/math] is obtained by solving

[math]\displaystyle{ \frac{\widehat{\lambda}}{1-e^{-\widehat{\lambda}}} = \bar{x} }[/math]

where [math]\displaystyle{ \bar{x} }[/math] is the sample mean.[1]

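This equation does not have a closed-form solution in terms of elementary functions. Rearranging it as [math]\displaystyle{ (\widehat{\lambda}-\bar{x})e^{\widehat{\lambda}-\bar{x}} = -\bar{x}e^{-\bar{x}} }[/math] shows that, for [math]\displaystyle{ \bar{x} \gt 1 }[/math], the solution can be written with the principal branch of the Lambert W function as [math]\displaystyle{ \widehat{\lambda} = \bar{x} + W_0(-\bar{x}e^{-\bar{x}}) }[/math]. In practice, a solution may be found using numerical methods.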
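One simple numerical approach is bisection. The left-hand side λ/(1 − e^(−λ)) increases with λ, tends to 1 as λ → 0, and always exceeds λ, so for a sample mean greater than 1 the root lies between 0 and the sample mean. The sketch below relies on that bracket; the function names, tolerance, and sample data are hypothetical and chosen only for illustration.

```python
import math

def ztp_mean(lam: float) -> float:
    return lam / -math.expm1(-lam)  # lam / (1 - e^{-lam})

def ztp_lambda_from_mean(xbar: float, tol: float = 1e-12) -> float:
    """Solve lam / (1 - exp(-lam)) = xbar for lam by bisection."""
    if xbar <= 1.0:
        raise ValueError("the sample mean must exceed 1 for a positive solution")
    lo, hi = 0.0, xbar  # ztp_mean(lam) -> 1 < xbar as lam -> 0, and ztp_mean(xbar) > xbar
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if ztp_mean(mid) < xbar:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

sample = [1, 1, 2, 3, 1, 2, 4, 1, 2, 2]  # made-up counts, all at least 1
xbar = sum(sample) / len(sample)
lam_hat = ztp_lambda_from_mean(xbar)
print(lam_hat, ztp_mean(lam_hat))
```

Bisection is used here for robustness rather than speed; Newton's method or a library root finder would also work.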

Examples

Insurance claims:

Consider auto insurance claims, where each claim corresponds to a distinct event such as an accident or an instance of damage. If the data contain only policyholders who have filed at least one claim, a count of zero cannot occur, and the ZTP distribution is a natural model for the number of claims per policyholder.

Let X denote the random variable representing the number of insurance claims. If λ is the rate parameter of the underlying (untruncated) Poisson distribution of claims, the ZTP probability mass function takes the form:

[math]\displaystyle{ P(X=k)=\frac{\lambda ^ k e^{- \lambda} }{k ! \left ( 1 - e^{- \lambda} \right )} }[/math] for k= 1,2,3,...

This formula gives the probability of observing k claims given that at least one claim has occurred. The denominator is the probability of at least one claim under the untruncated Poisson distribution; dividing by it removes the zero-claim case and rescales the remaining probabilities so that they sum to one. The same model applies to other counts that are observed only when positive. For example, a manufacturing company can use the zero-truncated Poisson distribution to analyze the number of defects per product among products that have at least one defect, which can inform its quality control process.

Comparison with Poisson distribution

The zero-truncated Poisson (ZTP) distribution and the standard Poisson distribution differ in whether a count of zero is possible. The standard Poisson distribution allows zero events, making it suitable for situations where the event of interest may or may not happen. Its support is the non-negative integers (0, 1, 2, ...), and both its mean and variance are equal to the rate parameter (λ). It is commonly used when occurrences are independent and happen at a constant average rate, such as modeling phone call arrivals, radioactive decay events, or website hits.

The zero-truncated Poisson distribution, by contrast, excludes the possibility of zero events. It models situations where the event is certain to occur at least once, or where zero counts cannot be observed. Its support is restricted to the positive integers (1, 2, 3, ...). Because the probability mass at zero is removed, its mean λ/(1 − e^(−λ)) is greater than λ, while its variance is smaller than λ; both approach λ as λ grows, since the probability of a zero count then becomes negligible. This distribution is used in scenarios where zero occurrences are impossible or not meaningful, such as modeling insurance claims, product defects, or emergency room visits. The probability mass function (PMF) of the ZTP distribution reflects this exclusion by rescaling the Poisson probabilities for k ≥ 1 so that they sum to one. [5]
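The relationship between the two distributions can be checked numerically. The sketch below is illustrative, with function names and λ values chosen here as assumptions. It rescales Poisson probabilities by 1/(1 − e^(−λ)) to obtain ZTP probabilities and compares the ZTP mean and variance with λ, using the formulas given earlier in the article.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def compare_with_poisson(lam: float, k_max: int = 5):
    """Return (rows, ztp_mean, ztp_var); each row is (k, Poisson pmf, ZTP pmf)."""
    scale = 1.0 / (1.0 - math.exp(-lam))  # 1 / P(X > 0) under the Poisson
    rows = [(k, poisson_pmf(k, lam), poisson_pmf(k, lam) * scale)
            for k in range(1, k_max + 1)]
    ztp_mean = lam * scale
    ztp_var = ztp_mean * (1.0 + lam - ztp_mean)
    return rows, ztp_mean, ztp_var

for lam in (0.5, 2.0, 10.0):
    rows, mean, var = compare_with_poisson(lam)
    # The Poisson mean and variance both equal lam.
    print(f"lambda={lam}: ZTP mean={mean:.4f}, ZTP variance={var:.4f}")
```

For small λ the two distributions differ markedly, because a large share of the Poisson probability sits at zero; for large λ the rescaling factor is close to 1 and they nearly coincide.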

Applications in various fields

Figure: Probability mass function (p.m.f.) of the zero-truncated Poisson-Lindley distribution (ZTPLD) for θ = 0.7.

The zero-truncated Poisson distribution (ZTP) is applied in various fields to model counts that start from a minimum of one. In finance, it has been used to model the number of transactions or trades in a market, counted from the first trade. In ecology, it is used to analyze the number of offspring per reproductive event, since a birth involves at least one offspring. In the social sciences, it is used to model the frequency of events such as voting or purchasing among individuals who have done so at least once. In epidemiology, it is used to model the number of new cases in an outbreak, which by definition involves at least one case. [6]

Real-world data analysis

Analyses of real-world data illustrate how the zero-truncated Poisson distribution is used. In insurance claim data, the ZTP describes the distribution of claim counts among policyholders who have at least one claim. In product quality control, it models the number of defects per product, excluding defect-free products. In emergency room records, it models the number of visits among patients who have attended at least once. In biological studies, it models the number of offspring per reproductive event, which is at least one.

Bayesian perspective

In Bayesian statistics, the zero-truncated Poisson distribution can serve as the likelihood for count data in which zeros cannot occur or cannot be observed. A prior distribution on the parameter λ encodes prior beliefs about it. Combining the prior with the observed data yields a posterior distribution that reflects updated beliefs about λ, from which point estimates and measures of uncertainty can be derived. [7]
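To make the idea concrete, the sketch below approximates a posterior distribution for λ on a grid. It is a minimal illustration under stated assumptions and is not the method of the cited reference: the Gamma(a, b) prior, its hyperparameters, the grid range, and the data are all hypothetical choices made here. The ZTP likelihood of observations x_1, ..., x_n is proportional to λ^(Σx_i) / (e^λ − 1)^n.

```python
import math

def ztp_grid_posterior(data, a=1.0, b=1.0, lam_max=20.0, n_grid=2000):
    """Grid approximation to the posterior of lam under an assumed Gamma(a, b) prior."""
    n, s = len(data), sum(data)
    step = lam_max / n_grid
    grid = [(i + 0.5) * step for i in range(n_grid)]  # midpoints, so lam = 0 is avoided
    # Unnormalized log posterior: log prior + log likelihood, dropping constants.
    log_post = [(s + a - 1.0) * math.log(lam) - b * lam - n * math.log(math.expm1(lam))
                for lam in grid]
    top = max(log_post)  # subtract the maximum before exponentiating, for stability
    weights = [math.exp(v - top) for v in log_post]
    z = sum(weights)
    return grid, [w / z for w in weights]

data = [1, 1, 2, 3, 1, 2, 4, 1, 2, 2]  # made-up counts, all at least 1
grid, weights = ztp_grid_posterior(data)
post_mean = sum(lam * w for lam, w in zip(grid, weights))
print(post_mean)
```

A grid is used because this prior is not conjugate to the ZTP likelihood; the returned weights are posterior probabilities of the grid cells, and the grid must extend far enough to cover essentially all of the posterior mass.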

Generating zero-truncated Poisson-distributed random variables

Random variates from the zero-truncated Poisson distribution may be generated using algorithms derived from Poisson distribution sampling algorithms.[8]

init:
     Let k ← 1, t ← λ * e^(−λ) / (1 − e^(−λ)), s ← t.
     Generate uniform random number u in [0,1].
while s < u do:
     k ← k + 1.
     t ← t * λ / k.
     s ← s + t.
return k.
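The procedure above is inversion sampling: t holds P(X = k), s holds the cumulative probability P(X ≤ k), and the loop stops at the first k whose cumulative probability reaches u. A direct Python transcription using only the standard library might look as follows. The function name and the extra `t > 0.0` guard are additions made here; the guard prevents an endless loop if rounding leaves the accumulated sum marginally below a value of u very close to 1.

```python
import math
import random

def sample_ztp_inversion(lam: float, rng=random) -> int:
    """Draw one zero-truncated Poisson variate by sequential inversion."""
    k = 1
    t = lam / math.expm1(lam)  # P(X = 1) = lam * e^{-lam} / (1 - e^{-lam})
    s = t                      # cumulative probability P(X <= k)
    u = rng.random()           # uniform on [0, 1)
    while s < u and t > 0.0:
        k += 1
        t *= lam / k           # P(X = k) = P(X = k - 1) * lam / k
        s += t
    return k

rng = random.Random(12345)     # fixed seed so the example is reproducible
lam = 2.0
draws = [sample_ztp_inversion(lam, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
theoretical_mean = lam / -math.expm1(-lam)
print(sample_mean, theoretical_mean)
```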

The cost of the procedure above is linear in k, which may be large for large values of [math]\displaystyle{ \lambda }[/math]. Given access to an efficient sampler for non-truncated Poisson random variates, a non-iterative approach involves sampling from a truncated exponential distribution representing the time of the first event in a Poisson point process, conditional on such an event existing.[9] A simple NumPy implementation is:

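import numpy as np

def sample_zero_truncated_poisson(rate):
    # t: first event time of a unit-rate Poisson process on [0, rate], given at least one event
    u = np.random.uniform(np.exp(-rate), 1)
    t = -np.log(u)
    # The first event itself, plus a Poisson count of events in the remaining time
    return 1 + np.random.poisson(rate - t)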

References

  1. Cohen, A. Clifford (1960). "Estimating parameters in a conditional Poisson distribution". Biometrics 16 (2): 203–211. doi:10.2307/2527552. 
  2. Singh, Jagbir (1978). "A characterization of positive Poisson distribution and its application". SIAM Journal on Applied Mathematics 34: 545–548. doi:10.1137/0134043. 
  3. "Stata Data Analysis Examples: Zero-Truncated Poisson Regression". UCLA Institute for Digital Research and Education. http://www.ats.ucla.edu/stat/stata/dae/ztp.htm. 
  4. Johnson, Norman L.; Kemp, Adrianne W.; Kotz, Samuel (2005). Univariate Discrete Distributions (third ed.). Hoboken, NJ: Wiley-Interscience. 
  5. Bidram, Hamid (December 2019). "Geometric-Zero Truncated Poisson Distribution: Properties and Applications". https://www.researchgate.net/publication/337668852_Geometric-Zero_Truncated_Poisson_Distribution_Properties_and_Applications. 
  6. Shanker, Rama (January 2020). "Zero-truncated Poisson-Ishita Distribution and Its Applications". https://www.researchgate.net/publication/343176019_Zero-truncated_Poisson-Ishita_Distribution_and_Its_Applications. 
  7. Hassan, Anwar (January 2008). "On the Bayes estimator of parameter and reliability function of the zero-truncated Poisson distribution". https://www.researchgate.net/publication/264140277_ON_THE_BAYES_ESTIMATOR_OF_PARAMETER_AND_RELIABILITY_FUNCTION_OF_THE_ZERO-TRUNCATED_POISSON_DISTRIBUTION. 
  8. Borje, Gio (2016-06-01). "Zero-Truncated Poisson Distribution Sampling Algorithm". http://giocc.com/zero_truncated_poisson_sampling_algorithm.html. 
  9. Hardie, Ted (1 May 2005). "[R] simulate zero-truncated Poisson distribution". r-help (Mailing list). Retrieved 27 May 2022.