Bennett's inequality

In probability theory, Bennett's inequality provides an upper bound on the probability that the sum of independent random variables deviates from its expected value by more than any specified amount. Bennett's inequality was proved by George Bennett of the University of New South Wales in 1962.[1]

Statement

Let X1, …, Xn be independent random variables with finite variance. Further assume Xi ≤ a almost surely for all i, for some constant a > 0, and define [math]\displaystyle{ S_n = \sum_{i = 1}^n \left[X_i - \operatorname{E}(X_i)\right] }[/math] and [math]\displaystyle{ \sigma^2 = \sum_{i=1}^n \operatorname{E}(X_i-\operatorname{E} X_i)^2. }[/math] Then for any t ≥ 0,

[math]\displaystyle{ \Pr\left( S_n \gt t \right) \leq \exp\left( - \frac{\sigma^2}{a^2} h\left(\frac{at}{\sigma^2} \right)\right), }[/math]

where h(u) = (1 + u)log(1 + u) – u and log denotes the natural logarithm.[2][3]
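
As a minimal sketch, the bound can be evaluated numerically as follows. The function names `h` and `bennett_bound` and the parameter names `sigma2` and `a` are illustrative choices, not part of any library; `sigma2` stands for σ² and `a` for the almost sure bound in the statement above.

```python
import math


def h(u: float) -> float:
    """h(u) = (1 + u) log(1 + u) - u, defined here for u >= 0."""
    # log1p(u) computes log(1 + u) accurately when u is small.
    return (1.0 + u) * math.log1p(u) - u


def bennett_bound(t: float, sigma2: float, a: float) -> float:
    """Right-hand side of Bennett's inequality, an upper bound on Pr(S_n > t).

    t      -- deviation, t >= 0
    sigma2 -- sum of the variances of the summands (assumed > 0)
    a      -- almost sure bound on the summands (assumed > 0)
    """
    if t < 0 or sigma2 <= 0 or a <= 0:
        raise ValueError("need t >= 0, sigma2 > 0 and a > 0")
    return math.exp(-(sigma2 / a**2) * h(a * t / sigma2))


# Illustrative values only: total variance 4, summands bounded by 1.
print(bennett_bound(t=5.0, sigma2=4.0, a=1.0))
```

At t = 0 the bound equals 1, and it decreases as t grows.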

Generalizations and comparisons to other bounds

Freedman (1975)[4] gives a martingale version of Bennett's inequality, and Fan, Grama and Liu (2012)[5] give an improvement of it.

Hoeffding's inequality assumes only that the summands are bounded almost surely, while Bennett's inequality offers some improvement when the variances of the summands are small compared to their almost sure bounds. However, Hoeffding's inequality entails sub-Gaussian tails, whereas in general Bennett's inequality has Poissonian tails.[citation needed]

Bennett's inequality is most similar to the Bernstein inequalities, the first of which also gives concentration in terms of the variance and almost sure bound on the individual terms. Bennett's inequality is stronger than this bound, but more complicated to compute.[3]

In both inequalities, unlike some other inequalities or limit theorems, there is no requirement that the component variables have identical or similar distributions.[citation needed]
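
The relation between the two bounds can be sketched with a standard elementary inequality for h, written here in the notation of the statement above. For all u ≥ 0,

[math]\displaystyle{ h(u) \geq \frac{u^2}{2\left(1 + u/3\right)}. }[/math]

Substituting u = at/σ² into Bennett's inequality then gives a bound of Bernstein type:

[math]\displaystyle{ \Pr\left( S_n \gt t \right) \leq \exp\left( - \frac{\sigma^2}{a^2} h\left(\frac{at}{\sigma^2} \right)\right) \leq \exp\left( - \frac{t^2}{2\left(\sigma^2 + at/3\right)} \right). }[/math]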

Example

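Suppose that the Xi are independent binary random variables, each equal to 1 with probability p and to 0 otherwise. Taking a = 1 and bounding the variance σ² = np(1 − p) from above by np (the right-hand side of Bennett's inequality is nondecreasing in σ²), Bennett's inequality says that: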

[math]\displaystyle{ \Pr\left( \sum_{i = 1}^n X_i \gt pn + t \right) \leq \exp\left( - np h\left(\frac{t}{np}\right)\right). }[/math]

For [math]\displaystyle{ t \geq 10 np }[/math], [math]\displaystyle{ h(\frac{t}{np}) \geq \frac{t}{2np} \log \frac{t}{np}, }[/math] so

[math]\displaystyle{ \Pr\left( \sum_{i = 1}^n X_i \gt pn + t \right) \leq \left(\frac{t}{np}\right)^{-t/2} }[/math]

for [math]\displaystyle{ t \geq 10 np }[/math].

By contrast, Hoeffding's inequality gives a bound of [math]\displaystyle{ \exp(-2 t^2/n) }[/math] and the first Bernstein inequality gives a bound of [math]\displaystyle{ \exp(-\frac{t^2}{2np + 2t/3}) }[/math]. For [math]\displaystyle{ t \gg np }[/math], Hoeffding's inequality gives [math]\displaystyle{ \exp(-\Theta(t^2/n)) }[/math], Bernstein gives [math]\displaystyle{ \exp(-\Theta(t)) }[/math], and Bennett gives [math]\displaystyle{ \exp(-\Theta(t \log \frac{t}{np})) }[/math].
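
The comparison can be illustrated numerically. The following is a sketch under assumed parameter values (n = 1000, p = 0.01, t = 150, chosen only so that t ≥ 10np); the function names are hypothetical and simply transcribe the formulas of this example.

```python
import math


def h(u: float) -> float:
    """h(u) = (1 + u) log(1 + u) - u."""
    return (1.0 + u) * math.log1p(u) - u


def hoeffding_bound(n: int, t: float) -> float:
    """exp(-2 t^2 / n), for summands taking values in [0, 1]."""
    return math.exp(-2.0 * t * t / n)


def bernstein_bound(n: int, p: float, t: float) -> float:
    """exp(-t^2 / (2np + 2t/3))."""
    return math.exp(-t * t / (2.0 * n * p + 2.0 * t / 3.0))


def bennett_bound(n: int, p: float, t: float) -> float:
    """exp(-np h(t / np))."""
    return math.exp(-n * p * h(t / (n * p)))


def bennett_simplified(n: int, p: float, t: float) -> float:
    """(t / np)^(-t/2), stated in the text for t >= 10 np."""
    if t < 10.0 * n * p:
        raise ValueError("the simplified bound is stated for t >= 10 * n * p")
    return (t / (n * p)) ** (-t / 2.0)


# Assumed illustrative parameters: np = 10, so t = 150 satisfies t >= 10 np.
n, p, t = 1000, 0.01, 150.0
print("Hoeffding:         ", hoeffding_bound(n, t))
print("Bernstein:         ", bernstein_bound(n, p, t))
print("Bennett:           ", bennett_bound(n, p, t))
print("Bennett simplified:", bennett_simplified(n, p, t))
```

For these values Bennett's bound is the smallest of the four and Hoeffding's is the largest, in line with the asymptotic comparison above.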

References

  1. Bennett, G. (1962). "Probability Inequalities for the Sum of Independent Random Variables". Journal of the American Statistical Association 57 (297): 33–45. doi:10.2307/2282438. 
  2. Devroye, Luc; Lugosi, Gábor (2001). Combinatorial methods in density estimation. Springer. p. 11. ISBN 978-0-387-95117-1. https://books.google.com/books?id=jvT-sUt1HZYC&pg=PA11. 
  3. Boucheron, Stéphane; Lugosi, Gábor; Massart, Pascal (2013). Concentration inequalities, a nonasymptotic theory of independence. Oxford University Press. ISBN 978-0-19-953525-5. 
  4. Freedman, D. A. (1975). "On tail probabilities for martingales". The Annals of Probability 3 (1): 100–118. doi:10.1214/aop/1176996452. 
  5. Fan, X.; Grama, I.; Liu, Q. (2012). "Hoeffding's inequality for supermartingales". Stochastic Processes and Their Applications 122 (10): 3545–3559. doi:10.1016/j.spa.2012.06.009.