Sub-Gaussian distribution



In probability theory, a subgaussian distribution, the distribution of a subgaussian random variable, is a probability distribution with strong tail decay. More specifically, the tails of a subgaussian distribution are dominated by (i.e. decay at least as fast as) the tails of a Gaussian. This property gives subgaussian distributions their name.

Often in analysis, we divide an object (such as a random variable) into two parts, a central bulk and a distant tail, then analyze each separately. In probability, this division usually goes like "everything interesting happens near the center; the tail event is so rare that we may safely ignore it." Subgaussian distributions are worth studying because the gaussian distribution is well understood, so we can give sharp bounds on the rarity of the tail event. Similarly, the subexponential distributions are also worth studying.

Formally, the probability distribution of a random variable $X$ is called subgaussian if there is a positive constant $C$ such that for every $t \ge 0$,

$$\operatorname{P}(|X| \ge t) \le 2\exp(-t^2/C^2).$$
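As a sanity check, the standard normal itself satisfies this defining bound with $C = \sqrt{2}$, since $\operatorname{P}(|Z| \ge t) = \operatorname{erfc}(t/\sqrt{2}) \le 2e^{-t^2/2}$ by the Chernoff bound. A minimal numerical check (standard-library Python):

```python
import math

# The standard normal satisfies the defining tail bound with C = sqrt(2):
# P(|Z| >= t) = erfc(t / sqrt(2)) <= 2 exp(-t^2 / 2).
for t in [0.0, 0.5, 1.0, 2.0, 4.0, 8.0]:
    tail = math.erfc(t / math.sqrt(2))
    assert tail <= 2 * math.exp(-t * t / 2)
print("tail bound holds at the sampled points")
```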

There are many equivalent definitions. For example, a random variable $X$ is sub-Gaussian iff its tail is dominated (up to a constant) by the tail of a Gaussian:

$$\operatorname{P}(|X| \ge t) \le c\,\operatorname{P}(|Z| \ge t) \quad \text{for all } t > 0,$$

where $c > 0$ is a constant and $Z$ is a mean-zero Gaussian random variable.[1]: Theorem 2.6

Definitions

Subgaussian norm

The subgaussian norm of $X$, denoted $\|X\|_{\psi_2}$, is
$$\|X\|_{\psi_2} = \inf\{c > 0 : \operatorname{E}[\exp(X^2/c^2)] \le 2\}.$$
In other words, it is the Orlicz norm of $X$ generated by the Orlicz function $\Phi(u) = e^{u^2} - 1$. By condition (2) below, subgaussian random variables can be characterized as exactly those random variables with finite subgaussian norm.
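The infimum in this definition can be computed numerically, since $\operatorname{E}[\exp(X^2/c^2)]$ is decreasing in $c$. A minimal sketch (pure Python): for $Z \sim \mathcal{N}(0,1)$ we have the closed form $\operatorname{E}[\exp(\lambda Z^2)] = (1 - 2\lambda)^{-1/2}$ for $\lambda < 1/2$, and bisection recovers $\|Z\|_{\psi_2} = \sqrt{8/3}$:

```python
import math

def psi2_norm(mgf_of_square, lo=1e-6, hi=1e6, iters=200):
    """Subgaussian norm: smallest c with E[exp(X^2/c^2)] <= 2.

    mgf_of_square(lam) should return E[exp(lam * X^2)] (may be inf).
    The expectation decreases in c, so we can bisect on c.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mgf_of_square(1.0 / mid**2) <= 2.0:
            hi = mid
        else:
            lo = mid
    return hi

# For Z ~ N(0,1): E[exp(lam Z^2)] = (1 - 2 lam)^(-1/2) for lam < 1/2.
def gaussian_mgf_sq(lam):
    return (1 - 2 * lam) ** -0.5 if lam < 0.5 else math.inf

c = psi2_norm(gaussian_mgf_sq)
print(c, math.sqrt(8 / 3))  # both ≈ 1.63299
```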

Variance proxy

If there exists some $s^2$ such that $\operatorname{E}[e^{(X - \operatorname{E}[X])t}] \le e^{\frac{s^2 t^2}{2}}$ for all real $t$, then $s^2$ is called a variance proxy, and the smallest such $s^2$ is called the optimal variance proxy and denoted $\|X\|_{vp}^2$.

Since $\operatorname{E}[e^{(X - \operatorname{E}[X])t}] = e^{\frac{\sigma^2 t^2}{2}}$ when $X \sim \mathcal{N}(\mu, \sigma^2)$ is Gaussian, we then have $\|X\|_{vp}^2 = \sigma^2$, as it should.
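For a non-Gaussian example: a Rademacher variable (uniform on $\{-1, +1\}$) has MGF $\cosh t$, and $\cosh t \le e^{t^2/2}$ shows its variance proxy is at most $1$; since a proxy can never fall below the variance (which is $1$ here), it is exactly $1$. A quick numerical check:

```python
import math

# For X uniform on {-1, +1} (Rademacher), E[e^{tX}] = cosh(t).
# cosh(t) <= e^{t^2/2} for all t (compare Taylor series term by term),
# so ||X||_vp^2 <= 1; it equals 1 because a variance proxy can never
# be smaller than the variance.
ok = all(math.cosh(t) <= math.exp(t * t / 2)
         for t in [k / 10 for k in range(-100, 101)])
print(ok)  # True
```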

Equivalent definitions

Let $X$ be a random variable with zero mean, and let $K_1, K_2, K_3, \ldots$ be positive constants. The following conditions are equivalent: (Proposition 2.5.2 [2])

  1. Tail probability bound: $\operatorname{P}(|X| \ge t) \le 2\exp(-t^2/K_1^2)$ for all $t \ge 0$;
  2. Finite subgaussian norm: $\|X\|_{\psi_2} = K_2 < \infty$;
  3. Moment: $\operatorname{E}|X|^p \le 2K_3^p\,\Gamma\left(\frac{p}{2} + 1\right)$ for all $p \ge 1$, where $\Gamma$ is the Gamma function;
  4. Moment: $\operatorname{E}|X|^p \le K_4^p\, p^{p/2}$ for all $p \ge 1$;
  5. Moment-generating function (of $X$), or variance proxy[3][4]: $\operatorname{E}[e^{(X - \operatorname{E}[X])t}] \le e^{\frac{K_5^2 t^2}{2}}$ for all $t$;
  6. Moment-generating function (of $X^2$): $\operatorname{E}[e^{X^2 t^2}] \le e^{K_6^2 t^2}$ for all $t \in [-1/K_6, +1/K_6]$;
  7. Union bound: for some $c > 0$, $\operatorname{E}[\max\{|X_1 - \operatorname{E}[X]|, \ldots, |X_n - \operatorname{E}[X]|\}] \le c\sqrt{\log n}$ for all $n > c$, where $X_1, \ldots, X_n$ are i.i.d. copies of $X$;
  8. Subexponential: $X^2$ has a subexponential distribution.

Furthermore, the constant $K_i$ is the same in definitions (1) to (5), up to an absolute constant. So, for example, given a random variable satisfying (1) and (2), the minimal constants $K_1, K_2$ in the two definitions satisfy $K_1 \le c K_2$ and $K_2 \le c' K_1$, where $c, c'$ are absolute constants independent of the random variable.

Proof of equivalence

As an example, the first four definitions are equivalent by the proof below.

Proof. (1)$\Rightarrow$(3) By the layer cake representation,
$$\operatorname{E}|X|^p = \int_0^\infty \operatorname{P}(|X|^p \ge t)\,dt = \int_0^\infty p t^{p-1}\operatorname{P}(|X| \ge t)\,dt \le 2\int_0^\infty p t^{p-1}\exp\left(-\frac{t^2}{K_1^2}\right)dt.$$

After a change of variables $u = t^2/K_1^2$, we find that
$$\operatorname{E}|X|^p \le 2K_1^p\,\frac{p}{2}\int_0^\infty u^{\frac{p}{2}-1}e^{-u}\,du = 2K_1^p\,\frac{p}{2}\,\Gamma\left(\frac{p}{2}\right) = 2K_1^p\,\Gamma\left(\frac{p}{2}+1\right).$$

(3)$\Rightarrow$(2) By the Taylor series $e^x = 1 + \sum_{p=1}^\infty \frac{x^p}{p!}$,
$$\operatorname{E}[\exp(\lambda X^2)] = 1 + \sum_{p=1}^\infty \frac{\lambda^p \operatorname{E}[X^{2p}]}{p!} \le 1 + \sum_{p=1}^\infty \frac{2\lambda^p K_3^{2p}\,\Gamma(p+1)}{p!} = 1 + 2\sum_{p=1}^\infty (\lambda K_3^2)^p = \frac{2}{1 - \lambda K_3^2} - 1 \quad \text{for } \lambda K_3^2 < 1,$$
which is less than or equal to $2$ for $\lambda \le \frac{1}{3K_3^2}$. Let $K_2 \ge 3^{1/2} K_3$; then $\operatorname{E}[\exp(X^2/K_2^2)] \le 2$.


(2)$\Rightarrow$(1) By Markov's inequality,
$$\operatorname{P}(|X| \ge t) = \operatorname{P}\left(\exp\left(\frac{X^2}{K_2^2}\right) \ge \exp\left(\frac{t^2}{K_2^2}\right)\right) \le \frac{\operatorname{E}[\exp(X^2/K_2^2)]}{\exp(t^2/K_2^2)} \le 2\exp\left(-\frac{t^2}{K_2^2}\right).$$
(3)$\Leftrightarrow$(4) by the asymptotic formula for the Gamma function: $\Gamma(p/2+1) \sim \sqrt{\pi p}\left(\frac{p}{2e}\right)^{p/2}$.
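The implication (1)$\Rightarrow$(3) can be sanity-checked numerically for the standard normal, whose tail bound holds with $K_1 = \sqrt{2}$ and whose absolute moments have the closed form $\operatorname{E}|Z|^p = 2^{p/2}\,\Gamma\!\left(\frac{p+1}{2}\right)/\sqrt{\pi}$:

```python
import math

# Sanity check of (1) => (3) for Z ~ N(0,1): the tail bound
# P(|Z| >= t) <= 2 exp(-t^2/2) holds with K1 = sqrt(2), so the proof
# gives E|Z|^p <= 2 * K1^p * Gamma(p/2 + 1).
# Closed form for comparison: E|Z|^p = 2^(p/2) Gamma((p+1)/2) / sqrt(pi).
K1 = math.sqrt(2)
for p in range(1, 31):
    moment = 2 ** (p / 2) * math.gamma((p + 1) / 2) / math.sqrt(math.pi)
    bound = 2 * K1 ** p * math.gamma(p / 2 + 1)
    assert moment <= bound, p
print("moment bound verified for p = 1..30")
```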

From the proof, we can extract a cycle of three inequalities:

  • If $\operatorname{P}(|X| \ge t) \le 2\exp(-t^2/K^2)$, then $\operatorname{E}|X|^p \le 2K^p\,\Gamma\left(\frac{p}{2}+1\right)$ for all $p \ge 1$.
  • If $\operatorname{E}|X|^p \le 2K^p\,\Gamma\left(\frac{p}{2}+1\right)$ for all $p \ge 1$, then $\|X\|_{\psi_2} \le 3^{1/2} K$.
  • If $\|X\|_{\psi_2} \le K$, then $\operatorname{P}(|X| \ge t) \le 2\exp(-t^2/K^2)$.

In particular, the constants $K$ provided by the definitions are the same up to a constant factor, so we can say that the definitions are equivalent up to a constant independent of $X$.

Similarly, because up to a positive multiplicative constant, $\Gamma(p/2+1) = p^{p/2} \times \left((2e)^{-1/2}\,p^{1/(2p)}\right)^p$ for all $p \ge 1$, the definitions (3) and (4) are also equivalent up to a constant.

Basic properties

Basic properties —

  • If $X$ is subgaussian and $k > 0$, then $\|kX\|_{\psi_2} = k\|X\|_{\psi_2}$ and $\|kX\|_{vp} = k\|X\|_{vp}$.
  • (Triangle inequality) If $X, Y$ are subgaussian, then $\|X+Y\|_{vp}^2 \le (\|X\|_{vp} + \|Y\|_{vp})^2$.
  • (Chernoff bound) If $X$ is subgaussian, then $\operatorname{Pr}(X - \operatorname{E}[X] \ge t) \le e^{-\frac{t^2}{2\|X\|_{vp}^2}}$ for all $t \ge 0$.

$X \lesssim X'$ means that $X \le CX'$, where the positive constant $C$ is independent of $X$ and $X'$.

Subgaussian deviation bound — If $X$ is subgaussian, then $\|X - \operatorname{E}[X]\|_{\psi_2} \lesssim \|X\|_{\psi_2}$.

Independent subgaussian sum bound — If $X, Y$ are subgaussian and independent, then $\|X+Y\|_{vp}^2 \le \|X\|_{vp}^2 + \|Y\|_{vp}^2$.

Corollary — Linear sums of subgaussian random variables are subgaussian.

Partial converse (Matoušek 2008, Lemma 2.4) — If $\operatorname{E}[X] = 0$, $\operatorname{E}[X^2] = 1$, and $\ln \operatorname{Pr}(X \ge t) \le -\frac{1}{2}at^2$ for all $t > 0$, then $\ln \operatorname{E}[e^{tX}] \le C_a t^2$ for all $t > 0$, where $C_a > 0$ depends on $a$ only.


Corollary (Matoušek 2008, Lemma 2.2) — Let $X_1, \ldots, X_n$ be independent random variables with the same upper subgaussian tail, $\ln \operatorname{Pr}(X_i \ge t) \le -\frac{1}{2}at^2$ for all $t > 0$, and with $\operatorname{E}[X_i] = 0$, $\operatorname{E}[X_i^2] = 1$. Then for any unit vector $v \in \mathbb{R}^n$, the linear sum $\sum_i v_i X_i$ has a subgaussian tail:
$$\ln \operatorname{Pr}\left(\sum_i v_i X_i \ge t\right) \le -C_a t^2,$$
where $C_a > 0$ depends only on $a$.
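This can be illustrated by Monte Carlo, under the assumption that the $X_i$ are Rademacher signs (for which Hoeffding's inequality gives the explicit constant $\operatorname{Pr}(\sum_i v_i X_i \ge t) \le e^{-t^2/2}$ for a unit vector $v$):

```python
import math
import random

# Illustration: weighted sums of Rademacher signs keep a subgaussian
# tail.  For a unit vector v, Hoeffding's inequality gives
# Pr(sum_i v_i X_i >= t) <= exp(-t^2 / 2).
random.seed(0)
n, trials, t = 50, 20000, 2.0
v = [1.0 / math.sqrt(n)] * n          # a unit vector
hits = 0
for _ in range(trials):
    s = sum(vi * random.choice((-1.0, 1.0)) for vi in v)
    hits += (s >= t)
empirical = hits / trials
print(empirical, math.exp(-t * t / 2))  # empirical tail vs bound 0.1353...
```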


Concentration

Gaussian concentration inequality for Lipschitz functions (Tao 2012, Theorem 2.1.12.) — If $f : \mathbb{R}^n \to \mathbb{R}$ is $L$-Lipschitz, and $X \sim N(0, I)$ is a standard gaussian vector, then $f(X)$ concentrates around its expectation at the rate
$$\operatorname{Pr}(f(X) - \operatorname{E}[f(X)] \ge t) \le e^{-\frac{2t^2}{\pi^2 L^2}},$$
and similarly for the other tail.



Strictly subgaussian

Expanding the cumulant generating function,
$$\frac{1}{2}s^2 t^2 \ge \ln \operatorname{E}[e^{tX}] = \frac{1}{2}\operatorname{Var}[X]\,t^2 + \frac{\kappa_3}{6}t^3 + \cdots,$$
we find that $\operatorname{Var}[X] \le \|X\|_{vp}^2$. At the edge of possibility, a random variable $X$ satisfying $\operatorname{Var}[X] = \|X\|_{vp}^2$ is called strictly subgaussian.

Properties

Theorem.[5] Let X be a subgaussian random variable with mean zero. If all zeros of its characteristic function are real, then X is strictly subgaussian.

Corollary. If X1,,Xn are independent and strictly subgaussian, then any linear sum of them is strictly subgaussian.

Examples

By calculating the characteristic functions, we can show that some distributions are strictly subgaussian: symmetric uniform distribution, symmetric Bernoulli distribution.

Since a symmetric uniform distribution is strictly subgaussian, its convolution with itself is strictly subgaussian. That is, the symmetric triangular distribution is strictly subgaussian.

Since the symmetric Bernoulli distribution is strictly subgaussian, any symmetric Binomial distribution is strictly subgaussian.

Examples

distribution | $\|X\|_{\psi_2}$ | $\|X\|_{vp}^2$ | strictly subgaussian?
gaussian distribution $\mathcal{N}(0,1)$ | $\sqrt{8/3}$ | $1$ | Yes
mean-zero Bernoulli distribution $p\delta_q + q\delta_{-p}$ | solution $t$ of $p e^{(q/t)^2} + q e^{(p/t)^2} = 2$ | $\frac{p-q}{2(\log p - \log q)}$ | Iff $p = 0, 1/2, 1$
symmetric Bernoulli distribution $\frac{1}{2}\delta_{-1} + \frac{1}{2}\delta_{1}$ | $\frac{1}{\sqrt{\ln 2}}$ | $1$ | Yes
symmetric uniform distribution $U(-1,1)$ | solution $t$ of $\int_0^1 e^{x^2/t^2}\,dx = 2$, approximately $0.7727$ | $1/3$ | Yes
arbitrary distribution on interval $[a,b]$ | | $\le \left(\frac{b-a}{2}\right)^2$ |

The optimal variance proxy $\|X\|_{vp}^2$ is known for many standard probability distributions, including the beta, Bernoulli, Dirichlet[6], Kumaraswamy, triangular[7], truncated Gaussian, and truncated exponential.[8]

Bernoulli distribution

Let $p + q = 1$ with $p, q > 0$. Let $X$ follow the centered Bernoulli distribution $p\delta_q + q\delta_{-p}$, so that it has mean zero. Then $\|X\|_{vp}^2 = \frac{p-q}{2(\log p - \log q)}$.[5] Its subgaussian norm is $t$, where $t$ is the unique positive solution to $p e^{(q/t)^2} + q e^{(p/t)^2} = 2$.
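Both quantities are easy to evaluate numerically; the sketch below takes $p = 0.3$ as an illustrative choice and solves the subgaussian-norm equation by bisection (the left-hand side is decreasing in $t$):

```python
import math

# Optimal variance proxy and subgaussian norm of the centered
# Bernoulli distribution p*delta_q + q*delta_{-p}, for p = 0.3.
p = 0.3
q = 1 - p
vp2 = (p - q) / (2 * (math.log(p) - math.log(q)))

# Subgaussian norm: bisect for the t solving
# p e^{(q/t)^2} + q e^{(p/t)^2} = 2 (left side decreasing in t).
lo, hi = 0.1, 10.0
for _ in range(200):
    t = 0.5 * (lo + hi)
    if p * math.exp((q / t) ** 2) + q * math.exp((p / t) ** 2) <= 2:
        hi = t
    else:
        lo = t

# A variance proxy always dominates the variance: pq <= vp2 <= 1/4.
print(vp2, hi)
```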

Let $X$ be a random variable with symmetric Bernoulli distribution (or Rademacher distribution). That is, $X$ takes values $-1$ and $1$ with probabilities $1/2$ each. Since $X^2 = 1$, it follows that
$$\|X\|_{\psi_2} = \inf\{c > 0 : \operatorname{E}[\exp(X^2/c^2)] \le 2\} = \inf\{c > 0 : \exp(1/c^2) \le 2\} = \frac{1}{\sqrt{\ln 2}},$$
and hence $X$ is a subgaussian random variable.

Bounded distributions

[Figure: some commonly used bounded distributions.]

Bounded distributions have no tail at all, so clearly they are subgaussian.

If $X$ is bounded within the interval $[a, b]$, Hoeffding's lemma states that $\|X\|_{vp}^2 \le \left(\frac{b-a}{2}\right)^2$. Hoeffding's inequality is the Chernoff bound obtained using this fact.
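Assembling the pieces: for i.i.d. $X_i \sim U(0,1)$, each centered term has variance proxy at most $1/4$ by Hoeffding's lemma, the proxies add over independent sums, and the Chernoff bound then gives $\operatorname{Pr}\left(\sum_i (X_i - \tfrac12) \ge t\right) \le e^{-2t^2/n}$. A seeded Monte Carlo check:

```python
import math
import random

# Hoeffding's inequality assembled from the facts above: if
# X_i ~ U(0,1) are i.i.d., each centered X_i - 1/2 has variance proxy
# <= ((1-0)/2)^2 = 1/4, so S = sum_i (X_i - 1/2) has proxy <= n/4 and
# Pr(S >= t) <= exp(-t^2 / (2 * n/4)) = exp(-2 t^2 / n).
random.seed(1)
n, trials, t = 20, 20000, 3.0
bound = math.exp(-2 * t * t / n)      # = exp(-0.9) ≈ 0.4066
hits = 0
for _ in range(trials):
    s = sum(random.random() - 0.5 for _ in range(n))
    hits += (s >= t)
print(hits / trials, bound)
```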

Convolutions

[Figure: density of a mixture of three normal distributions (μ = 5, 10, 15, σ = 2) with equal weights; each component is shown as a weighted density integrating to 1/3.]

Since the sum of subgaussian random variables is still subgaussian, the convolution of subgaussian distributions is still subgaussian. In particular, any convolution of the normal distribution with any bounded distribution is subgaussian.

Mixtures

Given subgaussian distributions $X_1, X_2, \ldots, X_n$, we can construct an additive mixture $X$ as follows: first pick an index $i \in \{1, 2, \ldots, n\}$ with probability $p_i$, then sample from $X_i$.

Since $\operatorname{E}\left[\exp\left(\frac{X^2}{c^2}\right)\right] = \sum_i p_i \operatorname{E}\left[\exp\left(\frac{X_i^2}{c^2}\right)\right]$, we have $\|X\|_{\psi_2} \le \max_i \|X_i\|_{\psi_2}$, and so the mixture is subgaussian.

In particular, any gaussian mixture is subgaussian.

More generally, a mixture of infinitely many subgaussian distributions is also subgaussian, provided the subgaussian norms have a finite supremum: $\|X\|_{\psi_2} \le \sup_i \|X_i\|_{\psi_2} < \infty$.
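The bound $\|X\|_{\psi_2} \le \max_i \|X_i\|_{\psi_2}$ can be checked on a concrete two-component mixture of scaled Rademacher variables (an illustrative choice):

```python
import math

# Two-component mixture: X1 in {-1, +1} (psi2 norm 1/sqrt(ln 2)) and
# X2 in {-2, +2} (psi2 norm 2/sqrt(ln 2)), each picked with
# probability 1/2.  The mixture satisfies
# E[exp(X^2/c^2)] = 0.5 e^{1/c^2} + 0.5 e^{4/c^2};
# bisect for the smallest c making this <= 2, and compare it with the
# larger component norm.
def mixture_mgf(c):
    return 0.5 * math.exp(1 / c**2) + 0.5 * math.exp(4 / c**2)

lo, hi = 0.5, 10.0
for _ in range(200):
    c = 0.5 * (lo + hi)
    if mixture_mgf(c) <= 2:
        hi = c
    else:
        lo = c

max_component = 2 / math.sqrt(math.log(2))   # ≈ 2.402
print(hi, max_component)
```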

Subgaussian random vectors

So far, we have discussed subgaussianity for real-valued random variables. The definition extends to random vectors by the same guiding idea: a subgaussian random vector is one whose tail decays fast in every direction.

Let $X$ be a random vector taking values in $\mathbb{R}^n$.

Define.

  • $\|X\|_{\psi_2} := \sup_{v \in S^{n-1}} \|v^\top X\|_{\psi_2}$, where $S^{n-1}$ is the unit sphere in $\mathbb{R}^n$; similarly for the variance proxy, $\|X\|_{vp} := \sup_{v \in S^{n-1}} \|v^\top X\|_{vp}$.
  • $X$ is subgaussian iff $\|X\|_{\psi_2} < \infty$.

Theorem. (Theorem 3.4.6 [2]) For any positive integer $n$, the uniformly distributed random vector $X \sim U(\sqrt{n}\,S^{n-1})$ is subgaussian, with $\|X\|_{\psi_2} \lesssim 1$.

This is not so surprising, because as $n \to \infty$, the projection of $U(\sqrt{n}\,S^{n-1})$ onto the first coordinate converges in distribution to the standard normal distribution.

Maximum inequalities

Theorem — If $X_1, \ldots, X_n$ are mean-zero subgaussians, with $\|X_i\|_{vp}^2 \le \sigma^2$, then for any $\delta > 0$, we have $\max(X_1, \ldots, X_n) \le \sigma\sqrt{2\ln\frac{n}{\delta}}$ with probability $\ge 1 - \delta$.


Theorem (Exercise 2.5.10 [2]) — If $X_1, X_2, \ldots$ are subgaussians, with $\|X_n\|_{\psi_2} \le K$, then
$$\operatorname{E}\left[\sup_n \frac{|X_n|}{\sqrt{1 + \ln n}}\right] \lesssim K, \qquad \operatorname{E}\left[\max_{1 \le n \le N} |X_n|\right] \lesssim K\sqrt{\ln N}.$$
Further, the bound is sharp: when $X_1, X_2, \ldots$ are IID samples of $\mathcal{N}(0,1)$, we have $\operatorname{E}\left[\max_{1 \le n \le N} |X_n|\right] \gtrsim \sqrt{\ln N}$.[9]


Theorem (over a finite set[10]) — If $X_1, \ldots, X_n$ are subgaussian, with $\|X_i\|_{vp}^2 \le \sigma^2$, then
$$\operatorname{E}\left[\max_i (X_i - \operatorname{E}[X_i])\right] \le \sigma\sqrt{2\ln n}, \qquad \operatorname{Pr}\left(\max_i (X_i - \operatorname{E}[X_i]) > t\right) \le n e^{-\frac{t^2}{2\sigma^2}},$$
$$\operatorname{E}\left[\max_i |X_i - \operatorname{E}[X_i]|\right] \le \sigma\sqrt{2\ln(2n)}, \qquad \operatorname{Pr}\left(\max_i |X_i - \operatorname{E}[X_i]| > t\right) \le 2n e^{-\frac{t^2}{2\sigma^2}}.$$

Corollary (over a convex polytope) — Fix a finite set of vectors $v_1, \ldots, v_n$. If $X$ is a random vector such that each $\|v_i^\top X\|_{vp}^2 \le \sigma^2$, then the above 4 inequalities hold with $\max_{v \in \operatorname{conv}(v_1, \ldots, v_n)} (v^\top X - \operatorname{E}[v^\top X])$ replacing $\max_i (X_i - \operatorname{E}[X_i])$. Here, $\operatorname{conv}(v_1, \ldots, v_n)$ is the convex hull of the vectors $v_1, \ldots, v_n$.

Theorem (subgaussian random vectors) — If $X$ is a random vector in $\mathbb{R}^d$ such that $\|v^\top X\|_{vp}^2 \le \sigma^2$ for all $v$ on the unit sphere $S$, then
$$\operatorname{E}\left[\max_{v \in S} v^\top X\right] = \operatorname{E}\left[\max_{v \in S} |v^\top X|\right] \le 4\sigma\sqrt{d}.$$
For any $\delta > 0$, with probability at least $1 - \delta$,
$$\max_{v \in S} v^\top X = \max_{v \in S} |v^\top X| \le 4\sigma\sqrt{d} + 2\sigma\sqrt{2\log(1/\delta)}.$$
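The finite-set bounds above can be checked by a seeded Monte Carlo with standard normal variables ($\sigma = 1$):

```python
import math
import random

# Check of the finite-set maximum bounds for n i.i.d. N(0,1)
# variables (sigma = 1): E[max_i X_i] <= sqrt(2 ln n) and
# Pr(max_i X_i > t) <= n exp(-t^2/2).
random.seed(2)
n, trials, t = 100, 2000, 3.5
maxima = [max(random.gauss(0, 1) for _ in range(n)) for _ in range(trials)]
emp_mean = sum(maxima) / trials
emp_tail = sum(m > t for m in maxima) / trials
print(emp_mean, math.sqrt(2 * math.log(n)))   # empirical mean vs bound ≈ 3.03
print(emp_tail, n * math.exp(-t * t / 2))     # empirical tail vs bound
```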

Inequalities

Theorem. (Theorem 2.6.1 [2]) There exists a positive constant $C$ such that, given any number of independent mean-zero subgaussian random variables $X_1, \ldots, X_n$,
$$\left\|\sum_{i=1}^n X_i\right\|_{\psi_2}^2 \le C \sum_{i=1}^n \|X_i\|_{\psi_2}^2.$$

Theorem. (Hoeffding's inequality) (Theorem 2.6.3 [2]) There exists a positive constant $c$ such that, given any number of independent mean-zero subgaussian random variables $X_1, \ldots, X_N$,
$$\operatorname{P}\left(\left|\sum_{i=1}^N X_i\right| \ge t\right) \le 2\exp\left(-\frac{ct^2}{\sum_{i=1}^N \|X_i\|_{\psi_2}^2}\right) \quad \text{for all } t > 0.$$

Theorem. (Bernstein's inequality) (Theorem 2.8.1 [2]) There exists a positive constant $c$ such that, given any number of independent mean-zero subexponential random variables $X_1, \ldots, X_N$,
$$\operatorname{P}\left(\left|\sum_{i=1}^N X_i\right| \ge t\right) \le 2\exp\left(-c\min\left(\frac{t^2}{\sum_{i=1}^N \|X_i\|_{\psi_1}^2}, \frac{t}{\max_i \|X_i\|_{\psi_1}}\right)\right).$$

Theorem. (Khinchine inequality) (Exercise 2.6.5 [2]) There exists a positive constant $C$ such that, given any number of independent mean-zero, variance-one subgaussian random variables $X_1, \ldots, X_N$ with $\|X_i\|_{\psi_2} \le K$, any $p \ge 2$, and any coefficients $a_1, \ldots, a_N$,
$$\left(\sum_{i=1}^N a_i^2\right)^{1/2} \le \left\|\sum_{i=1}^N a_i X_i\right\|_{L^p} \le CK\sqrt{p}\left(\sum_{i=1}^N a_i^2\right)^{1/2}.$$

Hanson-Wright inequality

The Hanson-Wright inequality states that if a random vector $X$ is subgaussian in a certain sense, then any quadratic form $X^\top A X$ of this vector is also subgaussian/subexponential. Further, the upper bound on the tail of $X^\top A X$ is uniform.

A weak version of the following theorem was proved in (Hanson, Wright, 1971).[11] There are many extensions and variants. Much like the central limit theorem, the Hanson-Wright inequality is more a cluster of theorems with the same purpose, than a single theorem. The purpose is to take a subgaussian vector and uniformly bound its quadratic forms.

Theorem.[12][13] There exists a constant c, such that:

Let $n$ be a positive integer. Let $X_1, \ldots, X_n$ be independent random variables such that each satisfies $\operatorname{E}[X_i] = 0$. Combine them into a random vector $X = (X_1, \ldots, X_n)$. For any $n \times n$ matrix $A$, we have
$$\operatorname{P}\left(|X^\top A X - \operatorname{E}[X^\top A X]| > t\right) \le \max\left(2e^{-\frac{ct^2}{K^4\|A\|_F^2}},\ 2e^{-\frac{ct}{K^2\|A\|}}\right) = 2\exp\left[-c\min\left(\frac{t^2}{K^4\|A\|_F^2}, \frac{t}{K^2\|A\|}\right)\right],$$
where $K = \max_i \|X_i\|_{\psi_2}$, $\|A\|_F = \sqrt{\sum_{ij} A_{ij}^2}$ is the Frobenius norm of the matrix, and $\|A\| = \max_{\|x\|_2 = 1} \|Ax\|_2$ is the operator norm of the matrix.

In words, the quadratic form $X^\top A X$ has its tail uniformly bounded by an exponential or a gaussian, whichever is larger.


In the statement of the theorem, the constant $c$ is an "absolute constant", meaning that it has no dependence on $n, X_1, \ldots, X_n, A$. It is a mathematical constant, much like $\pi$ and $e$.
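The matrix quantities in the bound are straightforward to compute. The sketch below (pure Python, with a small example matrix of our own choosing) evaluates the Frobenius norm, the operator norm via power iteration on $A^\top A$, and the two branches of the exponent, taking $K = 1$ for illustration:

```python
import math

# Matrix norms appearing in the Hanson-Wright bound, for a small
# example matrix.  Operator norm via power iteration on A^T A.
A = [[3.0, 1.0],
     [0.0, 2.0]]

frob = math.sqrt(sum(a * a for row in A for a in row))  # ||A||_F = sqrt(14)

def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def operator_norm(M, iters=1000):
    Mt = [list(col) for col in zip(*M)]      # transpose
    x = [1.0] * len(M[0])
    for _ in range(iters):
        y = matvec(Mt, matvec(M, x))         # (M^T M) x
        s = math.sqrt(sum(v * v for v in y))
        x = [v / s for v in y]
    # Rayleigh quotient of M^T M at the top eigenvector is ||M||^2.
    lam = sum(a * b for a, b in zip(x, matvec(Mt, matvec(M, x))))
    return math.sqrt(lam)

op = operator_norm(A)

# Exponent in the tail bound (taking K = 1): the t^2 (gaussian) branch
# is active for small t, the t (exponential) branch for large t.
def exponent(t, K=1.0):
    return min(t * t / (K**4 * frob**2), t / (K**2 * op))

print(frob, op, exponent(1.0), exponent(100.0))
```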

Consequences

Theorem (subgaussian concentration).[12] There exists a constant c, such that:

Let $n, m$ be positive integers. Let $X_1, \ldots, X_n$ be independent random variables such that each satisfies $\operatorname{E}[X_i] = 0$ and $\operatorname{E}[X_i^2] = 1$. Combine them into a random vector $X = (X_1, \ldots, X_n)$. For any $m \times n$ matrix $A$, we have
$$\operatorname{P}\left(\left|\|AX\|_2 - \|A\|_F\right| > t\right) \le 2e^{-\frac{ct^2}{K^4\|A\|^2}}.$$
In words, the random vector $AX$ is concentrated on a spherical shell of radius $\|A\|_F$, in the sense that $\|AX\|_2 - \|A\|_F$ is subgaussian, with subgaussian norm at most $\sqrt{3/c}\,\|A\|K^2$.
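A seeded Monte Carlo illustration of this shell concentration, using a fixed matrix and Rademacher coordinates (which are mean zero, variance one):

```python
import math
import random

# Illustration of subgaussian concentration: for a fixed matrix A and
# a Rademacher vector X, ||AX||_2 stays close to the Frobenius norm
# ||A||_F, with deviations on the (much smaller) scale of ||A||.
random.seed(3)
m, n, trials = 30, 300, 200
A = [[random.gauss(0, 1) for _ in range(n)] for _ in range(m)]  # fixed once
frob = math.sqrt(sum(a * a for row in A for a in row))

def norm_AX():
    x = [random.choice((-1.0, 1.0)) for _ in range(n)]
    return math.sqrt(sum(sum(a * xi for a, xi in zip(row, x)) ** 2
                         for row in A))

devs = [abs(norm_AX() - frob) for _ in range(trials)]
avg_rel_dev = sum(devs) / trials / frob
print(avg_rel_dev)  # modest relative to ||A||_F: ||AX|| hugs the shell
```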


Notes

  1. Wainwright MJ. High-Dimensional Statistics: A Non-Asymptotic Viewpoint. Cambridge: Cambridge University Press; 2019. doi:10.1017/9781108627771, ISBN 9781108627771.
  2. Vershynin, R. (2018). High-dimensional probability: An introduction with applications in data science. Cambridge: Cambridge University Press.
  3. Kahane, J. (1960). "Propriétés locales des fonctions à séries de Fourier aléatoires". Studia Mathematica 19: 1–25. doi:10.4064/sm-19-1-1-25. 
  4. Buldygin, V. V.; Kozachenko, Yu. V. (1980). "Sub-Gaussian random variables". Ukrainian Mathematical Journal 32 (6): 483–489. doi:10.1007/BF01087176. 
  5. Bobkov, S. G.; Chistyakov, G. P.; Götze, F. (2023-08-03). "Strictly subgaussian probability distributions". arXiv:2308.01749 [math.PR].
  6. Marchal, Olivier; Arbel, Julyan (2017). "On the sub-Gaussianity of the Beta and Dirichlet distributions". Electronic Communications in Probability 22. doi:10.1214/17-ECP92. 
  7. Arbel, Julyan; Marchal, Olivier; Nguyen, Hien D. (2020). "On strict sub-Gaussianity, optimal proxy variance and symmetry for bounded random variables". Esaim: Probability and Statistics 24: 39–55. doi:10.1051/ps/2019018. 
  8. Barreto, Mathias; Marchal, Olivier; Arbel, Julyan (2024). "Optimal sub-Gaussian variance proxy for truncated Gaussian and exponential random variables". arXiv:2403.08628 [math.ST].
  9. Kamath, Gautam. "Bounds on the expectation of the maximum of samples from a gaussian." (2015)
  10. "MIT 18.S997 | Spring 2015 | High-Dimensional Statistics, Chapter 1. Sub-Gaussian Random Variables" (in en). https://ocw.mit.edu/courses/18-s997-high-dimensional-statistics-spring-2015/a69e2f53bb2eeb9464520f3027fc61e6_MIT18_S997S15_Chapter1.pdf. 
  11. Hanson, D. L.; Wright, F. T. (1971). "A Bound on Tail Probabilities for Quadratic Forms in Independent Random Variables". The Annals of Mathematical Statistics 42 (3): 1079–1083. doi:10.1214/aoms/1177693335. ISSN 0003-4851. 
  12. Rudelson, Mark; Vershynin, Roman (January 2013). "Hanson-Wright inequality and sub-gaussian concentration". Electronic Communications in Probability 18: 1–9. doi:10.1214/ECP.v18-2865. ISSN 1083-589X. https://projecteuclid.org/journals/electronic-communications-in-probability/volume-18/issue-none/Hanson-Wright-inequality-and-sub-gaussian-concentration/10.1214/ECP.v18-2865.full.
  13. Vershynin, Roman (2018). "6. Quadratic Forms, Symmetrization, and Contraction". High-Dimensional Probability: An Introduction with Applications in Data Science. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge: Cambridge University Press. pp. 127–146. doi:10.1017/9781108231596.009. ISBN 978-1-108-41519-4. https://doi.org/10.1017/9781108231596.009. 
