Discrete Weibull distribution

Discrete Weibull
Parameters [math]\displaystyle{ \alpha\gt 0 }[/math] scale
[math]\displaystyle{ \beta \gt 0 }[/math] shape
Support [math]\displaystyle{ x \in \{0, 1,2,\ldots\} }[/math]
pmf [math]\displaystyle{ \exp\left[-\left(\frac{x }{\alpha}\right)^\beta \right]- \exp\left[-\left(\frac{x+1}{\alpha}\right)^\beta \right] }[/math]
CDF [math]\displaystyle{ 1-\exp\left[-\left(\frac{x+1}{\alpha}\right)^\beta \right] }[/math]

In probability theory and statistics, the discrete Weibull distribution is the discrete analog of the continuous Weibull distribution. First introduced by Toshio Nakagawa and Shunji Osaki, it is used predominantly in reliability engineering, where it is particularly suited to modeling failure data measured in discrete units such as cycles or shocks. More generally, it applies wherever the time to an event is counted in whole intervals rather than measured on a continuous scale.
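
The pmf and CDF listed in the summary above can be evaluated directly. The following is a minimal Python sketch; the function names `dw_pmf` and `dw_cdf` and the parameter values are illustrative assumptions, not part of any library.

```python
import math

def dw_pmf(x, alpha, beta):
    """P(X = x) for x = 0, 1, 2, ... (scale alpha > 0, shape beta > 0)."""
    return math.exp(-((x / alpha) ** beta)) - math.exp(-(((x + 1) / alpha) ** beta))

def dw_cdf(x, alpha, beta):
    """P(X <= x) for x = 0, 1, 2, ..."""
    return 1.0 - math.exp(-(((x + 1) / alpha) ** beta))

alpha, beta = 3.0, 1.5  # illustrative values only

# The pmf telescopes, so its partial sums reproduce the CDF.
partial_sum = sum(dw_pmf(k, alpha, beta) for k in range(6))
print(partial_sum, dw_cdf(5, alpha, beta))
```

Because consecutive pmf terms cancel, the sum of the pmf over 0, ..., x equals the CDF at x, which is a quick sanity check on any implementation.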


Alternative parametrizations

In the original paper by Nakagawa and Osaki, the parametrization [math]\displaystyle{ q = e^{-\alpha^{-\beta}} }[/math] is used, making the cumulative distribution function [math]\displaystyle{ 1-q^{(x+1)^\beta} }[/math] with [math]\displaystyle{ q \in (0,1) }[/math] and the probability mass function [math]\displaystyle{ q^{x^\beta}- q^{(x+1)^\beta} }[/math]. Setting [math]\displaystyle{ \beta=1 }[/math] reduces the probability mass function to [math]\displaystyle{ q^x(1-q) }[/math], which makes the relationship with the geometric distribution apparent.[1]

The CDF of the discrete Weibull distribution with q = 0.5 for x from 1 to 5. The β values are as follows: Red = 0.5, Green = 1.0, Blue = 1.5, Purple = 2.0, Orange = 2.5.

The PMF of the discrete Weibull distribution with q = 0.5 for x from 1 to 5. The β values are as follows: Red = 0.5, Green = 1.0, Blue = 1.5, Purple = 2.0, Orange = 2.5.
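
The geometric special case can be checked numerically. The sketch below assumes the conversion q = exp(-α^(-β)), chosen so that the q form agrees with the (α, β) pmf given in the summary at the top of the article; the function names are hypothetical.

```python
import math

def q_from_alpha(alpha, beta):
    """Assumed conversion q = exp(-alpha**(-beta)), which lies in (0, 1)."""
    return math.exp(-(alpha ** (-beta)))

def dw_pmf_q(x, q, beta):
    """pmf in the q form: q**(x**beta) - q**((x + 1)**beta)."""
    return q ** (x ** beta) - q ** ((x + 1) ** beta)

def geometric_pmf(x, p):
    """Geometric pmf on {0, 1, 2, ...} with success probability p."""
    return (1.0 - p) ** x * p

q = 0.5
for x in range(5):
    # With beta = 1 the two columns agree: q**x - q**(x + 1) = q**x * (1 - q).
    print(x, dw_pmf_q(x, q, 1.0), geometric_pmf(x, 1.0 - q))
```

Here the geometric distribution is taken on the support {0, 1, 2, ...} with success probability 1 - q, matching the support of the discrete Weibull.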

An alternative parametrization — related to the Pareto distribution — has been used to estimate parameters in infectious disease modelling.[2] This parametrization introduces a parameter [math]\displaystyle{ \kappa=\frac{\beta}{\alpha^\beta} }[/math], meaning that the term [math]\displaystyle{ \left(\frac{1}{\alpha}\right)^\beta }[/math] can be replaced with [math]\displaystyle{ \frac{\kappa}{\beta} }[/math]. Therefore, the probability mass function can be expressed as

[math]\displaystyle{ \exp\left[-\frac{\kappa x^\beta}{\beta} \right] - \exp\left[-\frac{\kappa \left(x+1 \right)^\beta}{\beta} \right] }[/math],

and the cumulative distribution function can be expressed as

[math]\displaystyle{ 1-\exp\left[-\frac{\kappa \left(x+1 \right)^\beta}{\beta} \right] }[/math].
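
As a rough check that the κ form is only a reparametrization, the following sketch converts (α, β) to κ and compares the resulting pmf with the one written in terms of α. The helper names are assumptions made for illustration.

```python
import math

def kappa_from(alpha, beta):
    """kappa = beta / alpha**beta, as defined in the text."""
    return beta / alpha ** beta

def dw_pmf_kappa(x, kappa, beta):
    """pmf in the kappa form."""
    return (math.exp(-kappa * x ** beta / beta)
            - math.exp(-kappa * (x + 1) ** beta / beta))

def dw_cdf_kappa(x, kappa, beta):
    """CDF in the kappa form."""
    return 1.0 - math.exp(-kappa * (x + 1) ** beta / beta)

alpha, beta = 3.0, 1.5  # illustrative values only
kappa = kappa_from(alpha, beta)
for x in range(4):
    # pmf written directly in terms of alpha, for comparison
    direct = math.exp(-((x / alpha) ** beta)) - math.exp(-(((x + 1) / alpha) ** beta))
    print(x, dw_pmf_kappa(x, kappa, beta), direct)
```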

Location-scale transformation

The continuous Weibull distribution has a close relationship with the Gumbel distribution, which is easy to see when log-transforming the variable. A similar transformation can be made on the discrete Weibull.

Define [math]\displaystyle{ Y =\log(X+1) }[/math], so that [math]\displaystyle{ X = e^Y-1 }[/math] and (unconventionally) [math]\displaystyle{ Y \in \{ \log(1), \log(2), \ldots \} }[/math], and define parameters [math]\displaystyle{ \mu = \log(\alpha) }[/math] and [math]\displaystyle{ \sigma = \frac{1}{\beta} }[/math]. Substituting [math]\displaystyle{ x = e^y-1 }[/math] into the cumulative distribution function gives a location-scale parametrization:

[math]\displaystyle{ \Pr(X\leq x) = \Pr(X\leq e^y-1) = 1-\exp\left[-\left(\frac{x+1}{\alpha}\right)^\beta \right] = 1-\exp\left[-\left(\frac{e^y}{e^\mu} \right)^\frac{1}{\sigma} \right] = 1-\exp\left[-\exp\left[\frac{y-\mu}{\sigma}\right] \right] }[/math]

This form is convenient in estimation settings, and it opens up the possibility of regression with frameworks developed for Weibull regression and extreme value theory.[3]
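
The equivalence of the two forms of the CDF can be verified numerically. This is a minimal sketch with hypothetical function names and illustrative parameter values.

```python
import math

def dw_cdf(x, alpha, beta):
    """CDF in the (alpha, beta) form."""
    return 1.0 - math.exp(-(((x + 1) / alpha) ** beta))

def dw_cdf_location_scale(y, mu, sigma):
    """CDF in terms of y = log(x + 1), mu = log(alpha), sigma = 1 / beta."""
    return 1.0 - math.exp(-math.exp((y - mu) / sigma))

alpha, beta = 3.0, 1.5  # illustrative values only
mu, sigma = math.log(alpha), 1.0 / beta
for x in range(5):
    y = math.log(x + 1)
    print(x, dw_cdf(x, alpha, beta), dw_cdf_location_scale(y, mu, sigma))
```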

Comparison to Other Discrete Distributions

The discrete Weibull distribution can be compared with other common discrete distributions such as the Poisson, geometric, and negative binomial distributions, each of which has unique characteristics and applications.

Discrete Weibull vs. Poisson Distribution: The Poisson distribution is often used to model the number of rare event occurrences during a fixed period of time. It is characterized by a single parameter, λ, which is both the mean and variance of the distribution. The discrete Weibull distribution, on the other hand, is more flexible and can handle both over- and under-dispersion in count data. It has two parameters, q and β, which influence the shape and scale of the distribution. Unlike the Poisson distribution, which assumes events occur independently, the discrete Weibull can adapt to different event occurrence patterns.

Discrete Weibull vs. Geometric Distribution: The geometric distribution models the probability of the first success in a sequence of Bernoulli trials and is characterized by a single parameter, p, which is the probability of success on an individual trial. In contrast, the discrete Weibull distribution can model a broader range of data patterns due to its two parameters. While the geometric distribution is specifically for modeling the number of trials until the first success, the discrete Weibull can be used in a wider variety of scenarios, including those where the probability of success changes over trials.
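
One way to see how the probability of the event changes over trials is through the discrete hazard P(X = x | X ≥ x). The sketch below derives it from the survival function P(X ≥ x) = q^(x^β), which follows from the CDF in the q parametrization; the function name is hypothetical.

```python
def dw_hazard_q(x, q, beta):
    """P(X = x | X >= x) = 1 - q**((x + 1)**beta - x**beta)."""
    return 1.0 - q ** ((x + 1) ** beta - x ** beta)

q = 0.5  # illustrative value only
for beta in (0.5, 1.0, 2.0):
    # beta < 1: decreasing hazard; beta = 1: constant (geometric); beta > 1: increasing
    print(beta, [round(dw_hazard_q(x, q, beta), 4) for x in range(5)])
```

With β = 1 the hazard is the constant 1 - q, which is the memoryless behaviour of the geometric distribution.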

Discrete Weibull vs. Negative Binomial Distribution: The negative binomial distribution is used to model the number of Bernoulli trials needed before a particular number of successes is achieved. It is characterized by the probability of success and the number of successes. The discrete Weibull distribution, with its flexibility in modeling different data patterns, can be a better fit for data that does not conform to the specific scenario modeled by the negative binomial distribution.

Overall, the discrete Weibull distribution is preferred over these alternatives when the data are over- or under-dispersed, or when they do not fit the specific scenarios that the Poisson, geometric, or negative binomial distributions are best suited for. Its adaptability in terms of shape and scale makes it a versatile tool in statistical modeling of discrete data.[4]
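
The dispersion claim can be illustrated by computing the variance-to-mean ratio, which equals 1 for a Poisson distribution. The sketch below approximates the moments by truncated sums in the q parametrization; the function name, the truncation length, and the parameter values are assumptions made for illustration.

```python
import math

def dw_mean_var_q(q, beta, n_terms=500):
    """Truncated-sum approximations of the mean and variance.

    Uses P(X >= x) = q**(x**beta), so that
    E[X] = sum_{x>=1} P(X >= x) and E[X^2] = sum_{x>=1} (2x - 1) P(X >= x).
    """
    log_q = math.log(q)
    mean = 0.0
    second_moment = 0.0
    for x in range(1, n_terms + 1):
        survival = math.exp((x ** beta) * log_q)  # equals q**(x**beta)
        mean += survival
        second_moment += (2 * x - 1) * survival
    return mean, second_moment - mean ** 2

for beta in (1.0, 3.0):
    mean, var = dw_mean_var_q(0.9, beta)
    # A ratio above 1 indicates overdispersion, below 1 underdispersion.
    print(beta, mean, var, var / mean)
```

The truncation length is adequate for these parameter values but would need to be increased for small β, where the tail is heavy.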

Applications

The discrete Weibull distribution has been applied well beyond its traditional role in reliability engineering. One study uses it to model count data on fertility plans, capturing relationships influenced by factors such as education and family background. Unlike the Poisson distribution, it handles both overdispersed and underdispersed data, which makes it useful in social science research.[5]

A second paper, "On Bivariate Discrete Weibull Distribution", extends the distribution to bivariate data. It covers maximum likelihood estimation and Bayesian inference for bivariate discrete data and presents practical analyses of football match scores and nasal drainage severity.[6]

A third paper, "The Exponentiated Discrete Weibull Distribution", introduces a generalization termed the exponentiated discrete Weibull (EDW) distribution. The generalization can represent increasing, decreasing, bathtub-shaped, and inverted bathtub-shaped hazard rate functions, and it can model data that are overdispersed or underdispersed relative to a Poisson distribution. This makes it applicable to fields such as reliability engineering and failure time studies.[7]

References

  1. Nakagawa, Toshio; Osaki, Shunji (1975). "The discrete Weibull distribution". IEEE Transactions on Reliability 24 (5): 300–301. doi:10.1109/TR.1975.5214915. 
  2. "Heavy-tailed sexual contact networks and monkeypox epidemiology in the global outbreak, 2022". Science 378 (6615): 90–94. 2022. doi:10.1126/science.add4507. PMID 36137054. 
  3. Scholz, Fritz (1996). "Maximum Likelihood Estimation for Type I Censored Weibull Data Including Covariates". ISSTECH-96-022, Boeing Information & Support Services. https://scholar.google.se/scholar?q=Maximum+Likelihood+Estimation+for+Type+I+Censored+Weibull+Data+Including+Covariates&hl=en&as_sdt=0&as_vis=1&oi=scholart&sa=X&ved=0ahUKEwj3iLHK7KzMAhXBORoKHQtqC5sQgQMIGzAA. Retrieved 26 April 2016. 
  4. PyMC Developers. (n.d.). PyMC3 3.11.5 Documentation: Discrete distributions. Retrieved from https://docs.pymc.io/en/v3/api/distributions/discrete.html
  5. Peluso, Alina; Vinciotti, Veronica; Yu, Keming (2019). "Discrete Weibull Generalized Additive Model: An Application to Count Fertility Data". Journal of the Royal Statistical Society Series C: Applied Statistics 68 (3): 565–583. doi:10.1111/rssc.12311. 
  6. Kundu, Debasis; Nekoukhou, Vahid (2019). "On bivariate discrete Weibull distribution". Communications in Statistics - Theory and Methods 48 (14): 3464–3481. doi:10.1080/03610926.2018.1476712. 
  7. Nekoukhou, Vahid; Bidram, Hamid (2015). "The exponentiated discrete Weibull distribution". SORT-Statistics and Operations Research Transactions 39 (1): 127–146. https://raco.cat/index.php/SORT/article/view/294381. 