Edgeworth series
The Gram–Charlier A series (named in honor of Jørgen Pedersen Gram and Carl Charlier), and the Edgeworth series (named in honor of Francis Ysidro Edgeworth) are series that approximate a probability distribution in terms of its cumulants.[1] The series are the same, but the arrangement of terms (and thus the accuracy of truncating the series) differs.[2] The key idea of these expansions is to write the characteristic function of the distribution whose probability density function f is to be approximated in terms of the characteristic function of a distribution with known and suitable properties, and to recover f through the inverse Fourier transform.
Gram–Charlier A series
We examine a continuous random variable. Let [math]\displaystyle{ \hat{f} }[/math] be the characteristic function of its distribution whose density function is f, and [math]\displaystyle{ \kappa_r }[/math] its cumulants. We expand in terms of a known distribution with probability density function ψ, characteristic function [math]\displaystyle{ \hat{\psi} }[/math], and cumulants [math]\displaystyle{ \gamma_r }[/math]. The density ψ is generally chosen to be that of the normal distribution, but other choices are possible as well. By the definition of the cumulants, we have (see Wallace, 1958)[3]
- [math]\displaystyle{ \hat{f}(t)= \exp\left[\sum_{r=1}^\infty\kappa_r\frac{(it)^r}{r!}\right] }[/math] and
- [math]\displaystyle{ \hat{\psi}(t)=\exp\left[\sum_{r=1}^\infty\gamma_r\frac{(it)^r}{r!}\right], }[/math]
which gives the following formal identity:
- [math]\displaystyle{ \hat{f}(t)=\exp\left[\sum_{r=1}^\infty(\kappa_r-\gamma_r)\frac{(it)^r}{r!}\right]\hat{\psi}(t)\,. }[/math]
By the properties of the Fourier transform, [math]\displaystyle{ (it)^r \hat{\psi}(t) }[/math] is the Fourier transform of [math]\displaystyle{ (-1)^r[D^r\psi](-x) }[/math], where D is the differential operator with respect to x. Thus, after replacing [math]\displaystyle{ x }[/math] with [math]\displaystyle{ -x }[/math] on both sides of the equation, we find for f the formal expansion
- [math]\displaystyle{ f(x) = \exp\left[\sum_{r=1}^\infty(\kappa_r - \gamma_r)\frac{(-D)^r}{r!}\right]\psi(x)\,. }[/math]
If ψ is chosen as the normal density
- [math]\displaystyle{ \phi(x) = \frac{1}{\sqrt{2\pi}\sigma}\exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right] }[/math]
with mean and variance as given by f, that is, mean [math]\displaystyle{ \mu = \kappa_1 }[/math] and variance [math]\displaystyle{ \sigma^2 = \kappa_2 }[/math], then the expansion becomes
- [math]\displaystyle{ f(x) = \exp\left[\sum_{r=3}^\infty\kappa_r\frac{(-D)^r}{r!}\right] \phi(x), }[/math]
since [math]\displaystyle{ \gamma_r=0 }[/math] for all r > 2, as higher cumulants of the normal distribution are 0. By expanding the exponential and collecting terms according to the order of the derivatives, we arrive at the Gram–Charlier A series. Such an expansion can be written compactly in terms of Bell polynomials as
- [math]\displaystyle{ \exp\left[\sum_{r=3}^\infty\kappa_r\frac{(-D)^r}{r!}\right] = \sum_{n=0}^\infty B_n(0,0,\kappa_3,\ldots,\kappa_n)\frac{(-D)^n}{n!}. }[/math]
Since the n-th derivative of the Gaussian function [math]\displaystyle{ \phi }[/math] can be expressed in terms of Hermite polynomials as
- [math]\displaystyle{ \phi^{(n)}(x) = \frac{(-1)^n}{\sigma^n} He_n \left( \frac{x-\mu}{\sigma} \right) \phi(x), }[/math]
this gives us the final expression of the Gram–Charlier A series as
- [math]\displaystyle{ f(x) = \phi(x) \sum_{n=0}^\infty \frac{1}{n! \sigma^n} B_n(0,0,\kappa_3,\ldots,\kappa_n) He_n \left( \frac{x-\mu}{\sigma} \right). }[/math]
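This expression can be evaluated numerically by building the complete Bell polynomials through the recurrence B_{n+1} = Σ_{k=0}^{n} C(n,k) B_{n−k} x_{k+1} and the probabilists' Hermite polynomials through He_{n+1}(x) = x·He_n(x) − n·He_{n−1}(x). The following Python sketch is illustrative (the function names are not from any particular library):

```python
import math

def hermite_prob(n, x):
    """Probabilists' Hermite polynomials He_0(x), ..., He_n(x)."""
    vals = [1.0, x]
    for k in range(1, n):
        # He_{k+1}(x) = x * He_k(x) - k * He_{k-1}(x)
        vals.append(x * vals[-1] - k * vals[-2])
    return vals[: n + 1]

def gram_charlier_density(x, mu, sigma, kappas, N):
    """Gram-Charlier A density truncated at order N; kappas maps r -> kappa_r
    for r >= 3 (the first two Bell-polynomial arguments are zero)."""
    args = [0.0, 0.0] + [kappas.get(r, 0.0) for r in range(3, N + 1)]
    # complete Bell polynomials via B_{n+1} = sum_k C(n,k) * B_{n-k} * x_{k+1}
    B = [1.0]
    for n in range(N):
        B.append(sum(math.comb(n, k) * B[n - k] * args[k] for k in range(n + 1)))
    z = (x - mu) / sigma
    He = hermite_prob(N, z)
    phi = math.exp(-z * z / 2) / (math.sqrt(2 * math.pi) * sigma)
    return phi * sum(B[n] * He[n] / (math.factorial(n) * sigma**n)
                     for n in range(N + 1))
```

With all cumulants of order three and higher set to zero, every Bell polynomial beyond B_0 vanishes and the sum collapses to the normal density, as expected.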
Integrating the series gives us the cumulative distribution function
- [math]\displaystyle{ F(x) = \int_{-\infty}^x f(u) du = \Phi(x) - \phi(x) \sum_{n=3}^\infty \frac{1}{n! \sigma^{n-1}} B_n(0,0,\kappa_3,\ldots,\kappa_n) He_{n-1} \left( \frac{x-\mu}{\sigma} \right), }[/math]
where [math]\displaystyle{ \Phi }[/math] is the CDF of the normal distribution.
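Keeping only the n = 3 and n = 4 terms of the integrated series gives a closed form that is easy to code, using He_2(z) = z² − 1 and He_3(z) = z³ − 3z. A minimal Python sketch (illustrative, not from a library):

```python
import math

def gc_cdf_two_term(x, mu, sigma, k3, k4):
    """Gram-Charlier CDF truncated after the n = 3 and n = 4 terms."""
    z = (x - mu) / sigma
    Phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))     # standard normal CDF
    phi = math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)
    He2 = z**2 - 1.0
    He3 = z**3 - 3.0 * z
    # the 1/sigma from phi(x) and the 1/sigma^{n-1} combine into 1/sigma^n
    return Phi - phi * (k3 / (6 * sigma**3) * He2 + k4 / (24 * sigma**4) * He3)
```

With k3 = k4 = 0 this reduces to the normal CDF evaluated at the standardized argument.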
If we include only the first two correction terms to the normal distribution, we obtain
- [math]\displaystyle{ f(x) \approx \frac{1}{\sqrt{2\pi}\sigma}\exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]\left[1+\frac{\kappa_3}{3!\sigma^3}He_3\left(\frac{x-\mu}{\sigma}\right)+\frac{\kappa_4}{4!\sigma^4}He_4\left(\frac{x-\mu}{\sigma}\right)\right]\,, }[/math]
with [math]\displaystyle{ He_3(x)=x^3-3x }[/math] and [math]\displaystyle{ He_4(x)=x^4 - 6x^2 + 3 }[/math].
Note that this expression is not guaranteed to be positive, and is therefore not a valid probability distribution. The Gram–Charlier A series diverges in many cases of interest—it converges only if [math]\displaystyle{ f(x) }[/math] falls off faster than [math]\displaystyle{ \exp(-(x^2)/4) }[/math] at infinity (Cramér 1957). When it does not converge, the series is also not a true asymptotic expansion, because it is not possible to estimate the error of the expansion. For this reason, the Edgeworth series (see next section) is generally preferred over the Gram–Charlier A series.
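The possible negativity is easy to demonstrate numerically: with a sufficiently large third cumulant, the truncated expression dips below zero in a tail. A short Python sketch (the parameter values are chosen purely for illustration):

```python
import math

def gc_two_term(x, mu, sigma, k3, k4):
    """Two-term Gram-Charlier density (skewness and kurtosis corrections)."""
    z = (x - mu) / sigma
    phi = math.exp(-z * z / 2) / (math.sqrt(2 * math.pi) * sigma)
    return phi * (1 + k3 / (6 * sigma**3) * (z**3 - 3 * z)
                    + k4 / (24 * sigma**4) * (z**4 - 6 * z**2 + 3))

# With k3 = k4 = 0 this is just the normal density, but with a large third
# cumulant the "density" turns negative in the left tail: at z = -2 the
# bracket is 1 + (k3/6) * He3(-2) = 1 - k3/3, negative once k3 > 3.
value = gc_two_term(-2.0, 0.0, 1.0, 4.0, 0.0)
```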
The Edgeworth series
Edgeworth developed a similar expansion as an improvement to the central limit theorem.[4] The advantage of the Edgeworth series is that the error is controlled, so that it is a true asymptotic expansion.
Let [math]\displaystyle{ \{Z_i\} }[/math] be a sequence of independent and identically distributed random variables with finite mean [math]\displaystyle{ \mu }[/math] and variance [math]\displaystyle{ \sigma^2 }[/math], and let [math]\displaystyle{ X_n }[/math] be their standardized sums:
- [math]\displaystyle{ X_n = \frac{1}{\sqrt{n}} \sum_{i=1}^n \frac{Z_i - \mu}{\sigma}. }[/math]
Let [math]\displaystyle{ F_n }[/math] denote the cumulative distribution functions of the variables [math]\displaystyle{ X_n }[/math]. Then by the central limit theorem,
- [math]\displaystyle{ \lim_{n\to\infty} F_n(x) = \Phi(x) \equiv \int_{-\infty}^x \tfrac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}q^2}dq }[/math]
for every [math]\displaystyle{ x }[/math], as long as the mean and variance are finite.
The standardization of [math]\displaystyle{ \{Z_i\} }[/math] ensures that the first two cumulants of [math]\displaystyle{ X_n }[/math] are [math]\displaystyle{ \kappa_1^{F_n} = 0 }[/math] and [math]\displaystyle{ \kappa_2^{F_n} = 1. }[/math] Now assume that, in addition to having mean [math]\displaystyle{ \mu }[/math] and variance [math]\displaystyle{ \sigma^2 }[/math], the i.i.d. random variables [math]\displaystyle{ Z_i }[/math] have higher cumulants [math]\displaystyle{ \kappa_r }[/math]. From the additivity and homogeneity properties of cumulants, the cumulants of [math]\displaystyle{ X_n }[/math] in terms of the cumulants of [math]\displaystyle{ Z_i }[/math] are for [math]\displaystyle{ r \geq 2 }[/math],
- [math]\displaystyle{ \kappa_r^{F_n} = \frac{n \kappa_r}{\sigma^r n^{r/2}} = \frac{\lambda_r}{n^{r/2 - 1}} \quad \mathrm{where} \quad \lambda_r = \frac{\kappa_r}{\sigma^r}. }[/math]
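This scaling is simple to verify numerically. The helpers below (illustrative names) compute the same quantity both ways; they can be checked against, e.g., the unit-rate exponential distribution, whose cumulants are κ_r = (r − 1)!:

```python
def standardized_cumulant(kappa_r, r, sigma, n):
    """Cumulant of order r >= 2 of the standardized sum X_n,
    computed directly as n * kappa_r / (sigma**r * n**(r/2))."""
    return n * kappa_r / (sigma**r * n ** (r / 2))

def via_lambda(kappa_r, r, sigma, n):
    """The same quantity written as lambda_r / n**(r/2 - 1)."""
    lam = kappa_r / sigma**r
    return lam / n ** (r / 2 - 1)
```

For the exponential distribution (κ_3 = 2, σ = 1) and n = 100, both forms give κ_3 of the standardized sum as 2/√100 = 0.2, and κ_2 equals 1 for any distribution and any n, as the standardization requires.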
If we expand the formal expression of the characteristic function [math]\displaystyle{ \hat{f}_n(t) }[/math] of [math]\displaystyle{ F_n }[/math] in terms of the standard normal distribution, that is, if we set
- [math]\displaystyle{ \phi(x)=\frac{1}{\sqrt{2\pi}}\exp(-\tfrac{1}{2}x^2), }[/math]
then the cumulant differences in the expansion are
- [math]\displaystyle{ \kappa^{F_n}_1-\gamma_1 = 0, }[/math]
- [math]\displaystyle{ \kappa^{F_n}_2-\gamma_2 = 0, }[/math]
- [math]\displaystyle{ \kappa^{F_n}_r-\gamma_r = \frac{\lambda_r}{n^{r/2-1}}; \qquad r\geq 3. }[/math]
The Gram–Charlier A series for the density function of [math]\displaystyle{ X_n }[/math] is now
- [math]\displaystyle{ f_n(x) = \phi(x) \sum_{r=0}^\infty \frac{1}{r!} B_r \left(0,0,\frac{\lambda_3}{n^{1/2}},\ldots,\frac{\lambda_r}{n^{r/2-1}}\right) He_r(x). }[/math]
The Edgeworth series is developed similarly to the Gram–Charlier A series, except that terms are now collected according to powers of [math]\displaystyle{ n }[/math]. The coefficient of the [math]\displaystyle{ n^{-m/2} }[/math] term can be obtained by collecting the monomials of the Bell polynomials corresponding to the integer partitions of m. Thus, we have the characteristic function as
- [math]\displaystyle{ \hat{f}_n(t)=\left[1+\sum_{j=1}^\infty \frac{P_j(it)}{n^{j/2}}\right] \exp(-t^2/2)\,, }[/math]
where [math]\displaystyle{ P_j(x) }[/math] is a polynomial of degree [math]\displaystyle{ 3j }[/math]. Again, after inverse Fourier transform, the density function [math]\displaystyle{ f_n }[/math] follows as
- [math]\displaystyle{ f_n(x) = \phi(x) + \sum_{j=1}^\infty \frac{P_j(-D)}{n^{j/2}} \phi(x)\,. }[/math]
Likewise, integrating the series, we obtain the distribution function
- [math]\displaystyle{ F_n(x) = \Phi(x) + \sum_{j=1}^\infty \frac{1}{n^{j/2}} \frac{P_j(-D)}{D} \phi(x)\,. }[/math]
We can explicitly write the polynomial [math]\displaystyle{ P_m(-D) }[/math] as
- [math]\displaystyle{ P_m(-D) = \sum \prod_i \frac{1}{k_i!} \left(\frac{\lambda_{l_i}}{l_i!}\right)^{k_i} (-D)^s, }[/math]
where the summation is over all the integer partitions of m such that [math]\displaystyle{ \sum_i i k_i = m }[/math] and [math]\displaystyle{ l_i = i+2 }[/math] and [math]\displaystyle{ s = \sum_i k_i l_i. }[/math]
For example, if m = 3, then there are three integer partitions of this number: 3 = 1 + 1 + 1, 3 = 1 + 2, and 3 = 3. As such we need to examine three cases:
- 1 + 1 + 1 = 1 · k1, so we have k1 = 3, l1 = 3, and s = 9.
- 1 + 2 = 1 · k1 + 2 · k2, so we have k1 = 1, k2 = 1, l1 = 3, l2 = 4, and s = 7.
- 3 = 3 · k3, so we have k3 = 1, l3 = 5, and s = 5.
Thus, the required polynomial is
- [math]\displaystyle{ \begin{align} P_3(-D) &= \frac{1}{3!} \left(\frac{\lambda_3}{3!}\right)^3 (-D)^9 + \frac{1}{1! 1!} \left(\frac{\lambda_3}{3!}\right) \left(\frac{\lambda_4}{4!}\right) (-D)^7 + \frac{1}{1!} \left(\frac{\lambda_5}{5!}\right) (-D)^5 \\ &= \frac{\lambda_3^3}{1296} (-D)^9 + \frac{\lambda_3 \lambda_4}{144} (-D)^7 + \frac{\lambda_5}{120} (-D)^5. \end{align} }[/math]
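The partition bookkeeping above can be automated. In the sketch below (hypothetical helper names), each partition of m is represented by its multiplicities k_i; with every λ factor set to 1, the map from derivative order s to rational coefficient reproduces the coefficients of P_3 computed by hand (terms with equal s are merged, which is why λ information is deliberately dropped here):

```python
from fractions import Fraction
from math import factorial

def partitions(m, max_part=None):
    """Yield the integer partitions of m as multiplicity dicts {part i: k_i}."""
    if max_part is None:
        max_part = m
    if m == 0:
        yield {}
        return
    for p in range(min(m, max_part), 0, -1):
        for rest in partitions(m - p, p):
            d = dict(rest)
            d[p] = d.get(p, 0) + 1
            yield d

def edgeworth_coefficients(m):
    """Map derivative order s to the coefficient of (-D)^s in P_m(-D),
    with all lambda factors set to 1."""
    terms = {}
    for mult in partitions(m):
        coeff, s = Fraction(1), 0
        for i, k in mult.items():
            l = i + 2                       # l_i = i + 2
            coeff *= Fraction(1, factorial(k)) * Fraction(1, factorial(l)) ** k
            s += k * l                      # s = sum_i k_i * l_i
        terms[s] = terms.get(s, Fraction(0)) + coeff
    return terms
```

For m = 3 this returns {9: 1/1296, 7: 1/144, 5: 1/120}, matching P_3 above; for m = 2 it gives the 1/24 and 1/72 coefficients that appear in the 1/n term of the expansion below.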
The first five terms of the expansion are[5]
- [math]\displaystyle{ \begin{align} f_n(x) &= \phi(x) \\ &\quad -\frac{1}{n^{\frac{1}{2}}}\left(\tfrac{1}{6}\lambda_3\,\phi^{(3)}(x) \right) \\ &\quad +\frac{1}{n}\left(\tfrac{1}{24}\lambda_4\,\phi^{(4)}(x) + \tfrac{1}{72}\lambda_3^2\,\phi^{(6)}(x) \right) \\ &\quad -\frac{1}{n^{\frac{3}{2}}}\left(\tfrac{1}{120}\lambda_5\,\phi^{(5)}(x) + \tfrac{1}{144}\lambda_3\lambda_4\,\phi^{(7)}(x) + \tfrac{1}{1296}\lambda_3^3\,\phi^{(9)}(x)\right) \\ &\quad + \frac{1}{n^2}\left(\tfrac{1}{720}\lambda_6\,\phi^{(6)}(x) + \left(\tfrac{1}{1152}\lambda_4^2 + \tfrac{1}{720}\lambda_3\lambda_5\right)\phi^{(8)}(x) + \tfrac{1}{1728}\lambda_3^2\lambda_4\,\phi^{(10)}(x) + \tfrac{1}{31104}\lambda_3^4\,\phi^{(12)}(x) \right)\\ &\quad + O \left (n^{-\frac{5}{2}} \right ). \end{align} }[/math]
Here, φ(j)(x) is the j-th derivative of φ(·) at point x. Since the derivatives of the normal density are related to it by [math]\displaystyle{ \phi^{(n)}(x) = (-1)^n He_n(x)\phi(x) }[/math], where [math]\displaystyle{ He_n }[/math] is the Hermite polynomial of order n, the expansion can equivalently be written in terms of the density function itself. Blinnikov and Moessner (1998) give a simple algorithm for computing higher-order terms of the expansion.
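Using the Hermite identity, the expansion through the 1/n term can be written directly in terms of He_3, He_4 and He_6; note the identity flips the signs, so the −1/√n term of the series becomes +λ_3 He_3(x) φ(x)/(6√n). A Python sketch (illustrative names):

```python
import math

def he(n, x):
    """Probabilists' Hermite polynomial He_n(x) via the three-term recurrence."""
    h_prev, h = 1.0, x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h

def edgeworth_density(x, lam3, lam4, n):
    """Edgeworth density of the standardized sum X_n, truncated after 1/n."""
    phi = math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
    return phi * (1
                  + lam3 * he(3, x) / (6 * math.sqrt(n))
                  + (lam4 * he(4, x) / 24 + lam3**2 * he(6, x) / 72) / n)
```

Setting λ_3 = λ_4 = 0 recovers the standard normal density, the leading term of the expansion.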
Note that in the case of a lattice distribution (which takes only discrete values), the Edgeworth expansion must be adjusted to account for the discontinuous jumps between lattice points.[6]
Illustration: density of the sample mean of three χ² distributions
Take [math]\displaystyle{ X_i \sim \chi^2(k=2), \, i=1, 2, 3 \, (n=3) }[/math] and the sample mean [math]\displaystyle{ \bar X = \frac{1}{3} \sum_{i=1}^{3} X_i }[/math].
We can use several distributions for [math]\displaystyle{ \bar X }[/math]:
- The exact distribution, which follows a gamma distribution: [math]\displaystyle{ \bar X \sim \mathrm{Gamma}\left(\alpha=n\cdot k /2, \theta= 2/n \right)=\mathrm{Gamma}\left(\alpha=3, \theta= 2/3 \right) }[/math].
- The asymptotic normal distribution: [math]\displaystyle{ \bar X \xrightarrow{n \to \infty} N(k, 2\cdot k /n ) = N(2, 4/3 ) }[/math].
- Two Edgeworth expansions, of degrees 2 and 3.
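Under these assumptions, the approximations can be compared numerically. The sketch below (illustrative, no external libraries) evaluates at a point y the exact Gamma density of the sample mean, the normal approximation, and an Edgeworth expansion through the 1/n term, using the χ²(k) cumulants κ_r = 2^{r−1}(r − 1)! k:

```python
import math

def he(n, x):
    """Probabilists' Hermite polynomial He_n(x)."""
    h_prev, h = 1.0, x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h

def chi2_mean_densities(y, n=3, k=2):
    """Densities of the sample mean of n iid chi^2(k) variables at y:
    exact Gamma, asymptotic normal, and order-1/n Edgeworth."""
    mu, var = float(k), 2.0 * k / n              # mean and variance of the mean
    sigma = math.sqrt(2.0 * k)                   # std dev of one chi^2(k)
    lam3 = 8.0 * k / sigma**3                    # kappa_3 / sigma^3
    lam4 = 48.0 * k / sigma**4                   # kappa_4 / sigma^4
    alpha, theta = n * k / 2.0, 2.0 / n          # exact Gamma parameters
    exact = (y ** (alpha - 1) * math.exp(-y / theta)
             / (math.gamma(alpha) * theta ** alpha))
    normal = math.exp(-(y - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
    z = (y - mu) / math.sqrt(var)                # standardized argument
    phi = math.exp(-z * z / 2) / math.sqrt(2 * math.pi)
    corr = (1 + lam3 * he(3, z) / (6 * math.sqrt(n))
              + (lam4 * he(4, z) / 24 + lam3**2 * he(6, z) / 72) / n)
    edgeworth = phi * corr / math.sqrt(var)      # rescale to the mean's scale
    return exact, normal, edgeworth
```

At y = 2 (the mean), the Edgeworth value lies noticeably closer to the exact Gamma density than the plain normal approximation does.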
Discussion of results
- For finite samples, an Edgeworth expansion is not guaranteed to be a proper probability distribution as the CDF values at some points may go beyond [math]\displaystyle{ [0,1] }[/math].
- Edgeworth expansions guarantee control of the absolute error (asymptotically); the relative error can be assessed by comparing the leading Edgeworth term in the remainder with the overall leading term.[2]
References
1. Stuart, A., & Kendall, M. G. (1968). The Advanced Theory of Statistics. Hafner Publishing Company.
2. Kolassa, John E. (2006). Series Approximation Methods in Statistics (3rd ed.). Springer. ISBN 0387322272.
3. Wallace, D. L. (1958). "Asymptotic Approximations to Distributions". Annals of Mathematical Statistics 29 (3): 635–654. doi:10.1214/aoms/1177706528.
4. Hall, P. (2013). The Bootstrap and Edgeworth Expansion. Springer Science & Business Media.
5. Weisstein, Eric W. "Edgeworth Series". http://mathworld.wolfram.com/EdgeworthSeries.html.
6. Kolassa, John E.; McCullagh, Peter (1990). "Edgeworth series for lattice distributions". Annals of Statistics 18 (2): 981–985. doi:10.1214/aos/1176347637.
Further reading
- H. Cramér. (1957). Mathematical Methods of Statistics. Princeton University Press, Princeton.
- Wallace, D. L. (1958). "Asymptotic approximations to distributions". Annals of Mathematical Statistics 29 (3): 635–654. doi:10.1214/aoms/1177706528.
- M. Kendall & A. Stuart. (1977), The advanced theory of statistics, Vol 1: Distribution theory, 4th Edition, Macmillan, New York.
- P. McCullagh (1987). Tensor Methods in Statistics. Chapman and Hall, London.
- D. R. Cox and O. E. Barndorff-Nielsen (1989). Asymptotic Techniques for Use in Statistics. Chapman and Hall, London.
- P. Hall (1992). The Bootstrap and Edgeworth Expansion. Springer, New York.
- Hazewinkel, Michiel, ed. (2001), "Edgeworth series", Encyclopedia of Mathematics, Springer Science+Business Media B.V. / Kluwer Academic Publishers, ISBN 978-1-55608-010-4, https://www.encyclopediaofmath.org/index.php?title=p/e035060
- Blinnikov, S.; Moessner, R. (1998). "Expansions for nearly Gaussian distributions". Astronomy and Astrophysics Supplement Series 130: 193–205. doi:10.1051/aas:1998221. Bibcode: 1998A&AS..130..193B. http://aas.aanda.org/articles/aas/pdf/1998/10/h0596.pdf.
- Martin, Douglas; Arora, Rohit (2017). "Inefficiency and bias of modified value-at-risk and expected shortfall". Journal of Risk 19 (6): 59–84. doi:10.21314/JOR.2017.365.
- J. E. Kolassa (2006). Series Approximation Methods in Statistics (3rd ed.). (Lecture Notes in Statistics #88). Springer, New York.
Original source: https://en.wikipedia.org/wiki/Edgeworth_series