Sub-Gaussian distribution
In probability theory, a sub-Gaussian distribution, the distribution of a sub-Gaussian random variable, is a probability distribution with strong tail decay. More specifically, the tails of a sub-Gaussian distribution are dominated by (i.e. decay at least as fast as) the tails of a Gaussian. This property gives sub-Gaussian distributions their name.
Formally, the probability distribution of a random variable [math]\displaystyle{ X }[/math] is called sub-Gaussian if there is a positive constant C such that for every [math]\displaystyle{ t \geq 0 }[/math],
- [math]\displaystyle{ \operatorname{P}(|X| \geq t) \leq 2 \exp{(-t^2/C^2)} }[/math].
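For example, a standard normal variable satisfies this definition with [math]\displaystyle{ C = \sqrt{2} }[/math], since the Chernoff bound gives [math]\displaystyle{ \operatorname{P}(|X| \geq t) \leq 2\exp{(-t^2/2)} }[/math] for all [math]\displaystyle{ t \geq 0 }[/math]. A minimal Python sketch of this instance (the grid of [math]\displaystyle{ t }[/math] values is an arbitrary illustrative choice, not from the source):

```python
# A minimal numerical sketch (not from the source): a standard normal
# satisfies the tail definition with C = sqrt(2), since the Chernoff bound
# gives P(|X| >= t) <= 2 exp(-t^2/2) for all t >= 0.
import math
from statistics import NormalDist

Z = NormalDist(0, 1)
C = math.sqrt(2)
for t in [0.5, 1.0, 2.0, 3.0]:
    tail = 2 * (1 - Z.cdf(t))            # exact two-sided tail P(|X| >= t)
    bound = 2 * math.exp(-t**2 / C**2)   # sub-Gaussian tail bound
    assert tail <= bound
    print(f"t={t}: P(|X|>=t)={tail:.4f} <= {bound:.4f}")
```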
Alternatively, a random variable is considered sub-Gaussian if its tail probabilities are upper bounded (up to a constant) by those of a Gaussian. Specifically, we say that [math]\displaystyle{ X }[/math] is sub-Gaussian if for all [math]\displaystyle{ s \geq 0 }[/math] we have that:
- [math]\displaystyle{ P(|X| \geq s) \leq cP(|Z| \geq s), }[/math]
where [math]\displaystyle{ c\ge 0 }[/math] is a constant and [math]\displaystyle{ Z }[/math] is a mean-zero Gaussian random variable.[1]
Definitions
The sub-Gaussian norm of [math]\displaystyle{ X }[/math], denoted as [math]\displaystyle{ \Vert X \Vert_{gauss} }[/math], is defined by[math]\displaystyle{ \Vert X \Vert_{gauss} = \inf\left\{ c\gt 0 : \operatorname{E}\left[\exp{\left(\frac{X^2}{c^2}\right)}\right] \leq 2 \right\}, }[/math]which is the Orlicz norm of [math]\displaystyle{ X }[/math] generated by the Orlicz function [math]\displaystyle{ \Phi(u)=e^{u^2}-1. }[/math] By condition [math]\displaystyle{ (2) }[/math] below, sub-Gaussian random variables can be characterized as those random variables with finite sub-Gaussian norm.
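For illustration (not part of the source), the sub-Gaussian norm of a standard normal variable can be approximated by bisecting on [math]\displaystyle{ c }[/math]; the Python sketch below evaluates [math]\displaystyle{ \operatorname{E}[\exp(X^2/c^2)] }[/math] by simple quadrature and recovers [math]\displaystyle{ \sqrt{8/3} }[/math], the value derived in the Examples section. The bracket endpoints and step counts are arbitrary illustrative choices.

```python
# A numerical sketch (illustration only): compute the sub-Gaussian norm of
# X ~ N(0,1) by bisecting on c in E[exp(X^2/c^2)] <= 2, with the expectation
# evaluated by trapezoidal quadrature against the Gaussian density.
import math

def orlicz_expectation(c, half_width=20.0, steps=40000):
    # E[exp(X^2/c^2)] = integral of exp(x^2/c^2) * pdf(x) dx; the integrand
    # decays like exp(-x^2 (1/2 - 1/c^2)), so it is integrable for c^2 > 2.
    h = 2 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        w = 0.5 if i in (0, steps) else 1.0  # trapezoid end-point weights
        total += w * math.exp(x*x / c**2) * math.exp(-x*x / 2) / math.sqrt(2*math.pi)
    return total * h

lo, hi = 1.5, 3.0                  # bracket; the expectation is decreasing in c
while hi - lo > 1e-6:
    mid = (lo + hi) / 2
    if orlicz_expectation(mid) <= 2:
        hi = mid                   # smallest admissible c lies at or below mid
    else:
        lo = mid

print(hi, math.sqrt(8 / 3))        # both approximately 1.63299
```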
Sub-Gaussian properties
Let [math]\displaystyle{ X }[/math] be a random variable. The following conditions are equivalent:
1. [math]\displaystyle{ \operatorname{P}(|X| \geq t) \leq 2 \exp{(-t^2/K_1^2)} }[/math] for all [math]\displaystyle{ t \geq 0 }[/math], where [math]\displaystyle{ K_1 }[/math] is a positive constant;
2. [math]\displaystyle{ \operatorname{E}[\exp{(X^2/K_2^2)}] \leq 2 }[/math], where [math]\displaystyle{ K_2 }[/math] is a positive constant;
3. [math]\displaystyle{ \operatorname{E} |X|^p \leq 2K_3^p \Gamma\left(\frac{p}{2}+1\right) }[/math] for all [math]\displaystyle{ p \geq 1 }[/math], where [math]\displaystyle{ K_3 }[/math] is a positive constant.
Proof. [math]\displaystyle{ (1)\implies(3) }[/math] By the layer cake representation,[math]\displaystyle{ \begin{align} \operatorname{E} |X|^p &= \int_0^\infty \operatorname{P}(|X|^p \geq t) dt\\ &= \int_0^\infty pt^{p-1}\operatorname{P}(|X| \geq t) dt\\ &\leq 2\int_0^\infty pt^{p-1}\exp\left(-\frac{t^2}{K_1^2}\right) dt\\ \end{align} }[/math]After a change of variables [math]\displaystyle{ u=t^2/K_1^2 }[/math], we find that[math]\displaystyle{ \begin{align} \operatorname{E} |X|^p &\leq 2K_1^p \frac{p}{2}\int_0^\infty u^{\frac{p}{2}-1}e^{-u} du\\ &= 2K_1^p \frac{p}{2}\Gamma\left(\frac{p}{2}\right)\\ &= 2K_1^p \Gamma\left(\frac{p}{2}+1\right). \end{align} }[/math]
[math]\displaystyle{ (3)\implies(2) }[/math] Using the Taylor series for [math]\displaystyle{ e^x }[/math]:[math]\displaystyle{ e^x = 1 + \sum_{p=1}^\infty \frac{x^p}{p!}, }[/math] we obtain that[math]\displaystyle{ \begin{align} \operatorname{E}[\exp{(\lambda X^2)}] &= 1 + \sum_{p=1}^\infty \frac{\lambda^p \operatorname{E}{[X^{2p}]}}{p!}\\ &\leq 1 + \sum_{p=1}^\infty \frac{2\lambda^p K_3^{2p} \Gamma\left(p+1\right)}{p!}\\ &= 1 + 2 \sum_{p=1}^\infty \lambda^p K_3^{2p}\\ &= 2 \sum_{p=0}^\infty \lambda^p K_3^{2p}-1\\ &= \frac{2}{1-\lambda K_3^2}-1 \quad\text{for }\lambda K_3^2 \lt 1, \end{align} }[/math]which is less than or equal to [math]\displaystyle{ 2 }[/math] for [math]\displaystyle{ \lambda \leq \frac{1}{3K_3^2} }[/math]. Take [math]\displaystyle{ K_2 \geq 3^{\frac{1}{2}}K_3 }[/math], then[math]\displaystyle{ \operatorname{E}[\exp{(X^2/K_2^2)}] \leq 2. }[/math]
[math]\displaystyle{ (2)\implies(1) }[/math] By Markov's inequality,[math]\displaystyle{ \operatorname{P}(|X|\geq t) = \operatorname{P}\left( \exp\left(\frac{X^2}{K_2^2}\right) \geq \exp\left(\frac{t^2}{K_2^2}\right) \right) \leq \frac{\operatorname{E}[\exp{(X^2/K_2^2)}]}{\exp\left(\frac{t^2}{K_2^2}\right)} \leq 2 \exp\left(-\frac{t^2}{K_2^2}\right). }[/math]
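As a concrete instance of this chain (a sketch, assuming [math]\displaystyle{ K_2 = \sqrt{8/3} }[/math], the sub-Gaussian norm of a standard normal computed in the Examples section), the tail bound produced by the Markov step can be checked numerically:

```python
# A concrete instance of (2) => (1) (sketch; K2 = sqrt(8/3) is the
# sub-Gaussian norm of a standard normal, derived in the Examples section):
# E[exp(X^2/K2^2)] <= 2 implies P(|X| >= t) <= 2 exp(-t^2/K2^2).
import math
from statistics import NormalDist

Z = NormalDist(0, 1)
K2 = math.sqrt(8 / 3)
for t in [1.0, 2.0, 3.0]:
    tail = 2 * (1 - Z.cdf(t))              # exact tail of |X|
    bound = 2 * math.exp(-(t / K2) ** 2)   # bound from Markov's inequality
    assert tail <= bound
    print(f"t={t}: {tail:.5f} <= {bound:.5f}")
```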
More equivalent definitions
The following properties are equivalent:
- The distribution of [math]\displaystyle{ X }[/math] is sub-Gaussian.
- Laplace transform condition: for some B, b > 0, [math]\displaystyle{ \operatorname{E} e^{\lambda (X-\operatorname{E}[X])} \leq Be^{\lambda^2 b} }[/math] holds for all [math]\displaystyle{ \lambda }[/math].
- Moment condition: for some K > 0, [math]\displaystyle{ \operatorname{E} |X|^p \leq K^p p^{p/2} }[/math] for all [math]\displaystyle{ p \geq 1 }[/math].
- Moment generating function condition: for some [math]\displaystyle{ L\gt 0 }[/math], [math]\displaystyle{ \operatorname{E}[\exp(\lambda^2 X^2)] \leq \exp(L^2\lambda^2) }[/math] for all [math]\displaystyle{ \lambda }[/math] such that [math]\displaystyle{ |\lambda| \leq \frac{1}{L} }[/math]. [2]
- Union bound condition: for some c > 0, [math]\displaystyle{ \operatorname{E}[\max\{|X_1 - \operatorname{E}[X]|,\ldots,|X_n - \operatorname{E}[X]|\}] \leq c \sqrt{\log n} }[/math] for all n > c, where [math]\displaystyle{ X_1, \ldots, X_n }[/math] are i.i.d. copies of X; a Monte Carlo sketch of this condition follows the list.
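The following Monte Carlo sketch (illustrative; the Gaussian choice of X and the trial counts are assumptions, not from the source) shows the [math]\displaystyle{ \sqrt{\log n} }[/math] growth of the expected maximum for i.i.d. standard normals:

```python
# A Monte Carlo sketch (illustrative, not from the source) of the union bound
# condition: for i.i.d. standard normals, E[max_i |X_i - E[X]|] grows like
# sqrt(log n), so the ratio to sqrt(log n) stays bounded as n grows.
import math
import random

def expected_max_abs(n, trials=2000):
    total = 0.0
    for _ in range(trials):
        total += max(abs(random.gauss(0, 1)) for _ in range(n))
    return total / trials

for n in [10, 100, 1000]:
    est = expected_max_abs(n)
    ratio = est / math.sqrt(math.log(n))
    print(f"n={n}: E[max|X_i|] ~ {est:.3f}, ratio to sqrt(log n) = {ratio:.3f}")
```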
Examples
A standard normal random variable [math]\displaystyle{ X\sim N(0,1) }[/math] is a sub-Gaussian random variable: for [math]\displaystyle{ c^2 \gt 2 }[/math], the Gaussian moment generating function gives [math]\displaystyle{ \operatorname{E}\left[\exp{\left(\frac{X^2}{c^2}\right)}\right] = \left(1-\frac{2}{c^2}\right)^{-1/2} }[/math], which equals [math]\displaystyle{ 2 }[/math] exactly when [math]\displaystyle{ c^2 = 8/3 }[/math], so [math]\displaystyle{ \Vert X \Vert_{gauss} = \sqrt{8/3} }[/math].
Let [math]\displaystyle{ X }[/math] be a random variable with symmetric Bernoulli distribution (or Rademacher distribution). That is, [math]\displaystyle{ X }[/math] takes values [math]\displaystyle{ -1 }[/math] and [math]\displaystyle{ 1 }[/math] with probabilities [math]\displaystyle{ 1/2 }[/math] each. Since [math]\displaystyle{ X^2=1 }[/math], it follows that [math]\displaystyle{ \Vert X \Vert_{gauss} = \inf\left\{ c\gt 0 : \operatorname{E}\left[\exp{\left(\frac{X^2}{c^2}\right)}\right] \leq 2 \right\} = \inf\left\{ c\gt 0 : \operatorname{E}\left[\exp{\left(\frac{1}{c^2}\right)}\right] \leq 2 \right\}=\frac{1}{\sqrt{\ln 2}}, }[/math]and hence [math]\displaystyle{ X }[/math] is a sub-Gaussian random variable.
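A one-line numerical check of this computation (illustrative only):

```python
# A quick check (sketch) of the Rademacher computation above: at
# c = 1/sqrt(ln 2) the quantity E[exp(X^2/c^2)] = exp(1/c^2) equals
# exactly 2, since X^2 = 1 with probability one.
import math

c = 1 / math.sqrt(math.log(2))
print(c, math.exp(1 / c**2))  # c ~ 1.2011, and exp(1/c^2) = 2.0
```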
Maximum of sub-Gaussian random variables
Consider a finite collection of sub-Gaussian random variables [math]\displaystyle{ X_1, \ldots, X_n }[/math], each with sub-Gaussian parameter [math]\displaystyle{ \sigma }[/math]. The random variable [math]\displaystyle{ M_n = \max(X_1, \ldots, X_n) }[/math] is the maximum of this collection. Its expectation can be bounded above: [math]\displaystyle{ \operatorname{E}[M_n] \leq \sqrt{2\sigma^2\log n} }[/math]. Note that no independence assumption is needed to form this bound.[1]
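A simulation sketch of this bound (assuming [math]\displaystyle{ \sigma = 1 }[/math] and Gaussian [math]\displaystyle{ X_i }[/math]; the shared-factor construction that induces correlation is an illustrative device, not from the source):

```python
# A simulation sketch of E[M_n] <= sqrt(2 sigma^2 log n) (assumptions:
# sigma = 1 and Gaussian X_i, which are sub-Gaussian with parameter 1;
# rho > 0 correlates the coordinates through a shared factor to illustrate
# that the bound needs no independence assumption).
import math
import random

def expected_max(n, trials=2000, rho=0.0):
    total = 0.0
    for _ in range(trials):
        shared = random.gauss(0, 1)
        xs = [math.sqrt(rho) * shared + math.sqrt(1 - rho) * random.gauss(0, 1)
              for _ in range(n)]   # each X_i is N(0, 1); correlated if rho > 0
        total += max(xs)
    return total / trials

n = 500
bound = math.sqrt(2 * math.log(n))   # sqrt(2 sigma^2 log n) with sigma = 1
print(f"independent: E[M_n] ~ {expected_max(n):.3f} <= {bound:.3f}")
print(f"correlated : E[M_n] ~ {expected_max(n, rho=0.5):.3f} <= {bound:.3f}")
```

The correlated collection has a smaller expected maximum, consistent with the bound holding without any independence assumption.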
Notes
- ↑ Wainwright, M. J. (2019). High-Dimensional Statistics: A Non-Asymptotic Viewpoint. Cambridge: Cambridge University Press. doi:10.1017/9781108627771. ISBN 9781108627771.
- ↑ Vershynin, R. (2018). High-dimensional probability: An introduction with applications in data science. Cambridge: Cambridge University Press. pp. 33–34.
References
- Kahane, J.P. (1960). "Propriétés locales des fonctions à séries de Fourier aléatoires". Studia Mathematica 19: 1–25. doi:10.4064/sm-19-1-1-25.
- Buldygin, V.V.; Kozachenko, Yu.V. (1980). "Sub-Gaussian random variables". Ukrainian Mathematical Journal 32 (6): 483–489. doi:10.1007/BF01087176.
- Ledoux, Michel; Talagrand, Michel (1991). Probability in Banach Spaces. Springer-Verlag.
- Stromberg, K.R. (1994). Probability for Analysts. Chapman & Hall/CRC.
- Litvak, A.E.; Pajor, A.; Rudelson, M.; Tomczak-Jaegermann, N. (2005). "Smallest singular value of random matrices and geometry of random polytopes". Advances in Mathematics 195 (2): 491–523. doi:10.1016/j.aim.2004.08.004. http://www.math.ualberta.ca/~alexandr/OrganizedPapers/lprtlastlast.pdf.
- Rudelson, Mark; Vershynin, Roman (2010). "Non-asymptotic theory of random matrices: extreme singular values". Proceedings of the International Congress of Mathematicians 2010. pp. 1576–1602. doi:10.1142/9789814324359_0111.
- Rivasplata, O. (2012). "Subgaussian random variables: An expository note". Unpublished. http://www.stat.cmu.edu/~arinaldo/36788/subgaussians.pdf.
- Vershynin, R. (2018). High-Dimensional Probability: An Introduction with Applications in Data Science. Cambridge Series in Statistical and Probabilistic Mathematics, vol. 47. Cambridge: Cambridge University Press.
- Zajkowski, K. (2020). "On norms in some class of exponential type Orlicz spaces of random variables". Positivity 24 (5): 1231–1240. arXiv:1709.02970. doi:10.1007/s11117-019-00729-6.