# Law of the unconscious statistician

In probability theory and statistics, the **law of the unconscious statistician**, or **LOTUS**, is a theorem used to calculate the expected value of a function *g*(*X*) of a random variable *X* when one knows the probability distribution of *X* but does not know the distribution of *g*(*X*). The form of the law can depend on the form in which one states the probability distribution of the random variable *X*. If it is a discrete distribution and one knows its probability mass function *f*_{X} (but not *f*_{g(X)}), then the expected value of *g*(*X*) is

- [math]\displaystyle{ \operatorname{E}[g(X)] = \sum_x g(x) f_X(x), \, }[/math]
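The discrete form of the law can be sketched directly in code. This is a minimal illustration under a hypothetical choice of distribution and function: *X* is a fair six-sided die roll and *g*(*x*) = *x*²; the helper name `lotus_discrete` is ours, not standard.

```python
# LOTUS for a discrete distribution: E[g(X)] = sum over x of g(x) * f_X(x).
# Hypothetical example: X is a fair die roll and g(x) = x**2.
pmf = {x: 1 / 6 for x in range(1, 7)}  # f_X

def lotus_discrete(g, pmf):
    """Expected value of g(X) computed directly from the pmf of X."""
    return sum(g(x) * p for x, p in pmf.items())

e_g = lotus_discrete(lambda x: x**2, pmf)
print(e_g)  # 91/6, about 15.1667
```

Note that the distribution of *g*(*X*) itself is never constructed; the sum runs over the values of *X*.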

where the sum is over all possible values *x* of *X*. If it is a continuous distribution and one knows its probability density function *f*_{X} (but not *f*_{g(X)}), then the expected value of *g*(*X*) is

- [math]\displaystyle{ \operatorname{E}[g(X)] = \int_{-\infty}^\infty g(x) f_X(x) \, \mathrm{d}x }[/math]
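The continuous form can be checked numerically. Below is a minimal sketch, assuming a hypothetical choice of *X* ~ Exponential(1) with density *f*_{X}(*x*) = e^{−x} and *g*(*x*) = *x*², whose exact expectation E[*X*²] = 2; the midpoint-rule integrator `lotus_continuous` is our own illustrative helper.

```python
import math

# LOTUS for a continuous distribution, approximated with a midpoint Riemann sum.
# Hypothetical example: X ~ Exponential(1), f_X(x) = exp(-x), g(x) = x**2,
# with exact value E[X^2] = 2.

def lotus_continuous(g, f, lo, hi, n=200_000):
    """Approximate the integral of g(x) * f(x) over [lo, hi] (midpoint rule)."""
    h = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * h) * f(lo + (i + 0.5) * h)
               for i in range(n)) * h

approx = lotus_continuous(lambda x: x**2, lambda x: math.exp(-x), 0.0, 50.0)
print(approx)  # close to 2.0
```

The truncation at 50 is harmless here because the tail mass of x² e^{−x} beyond that point is negligible.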

If one knows the cumulative distribution function *F*_{X} (but not *F*_{g(X)}), then the expected value of *g*(*X*) is given by a Riemann–Stieltjes integral

- [math]\displaystyle{ \operatorname{E}[g(X)] = \int_{-\infty}^\infty g(x) \, \mathrm{d}F_X(x) }[/math]

(again assuming *X* is real-valued).^{[1]}^{[2]}^{[3]}^{[4]}

## Etymology

This proposition is known as the law of the unconscious statistician because of a purported tendency to use the identity without realizing that it must be treated as the result of a rigorously proved theorem, not merely a definition.^{[4]}

## Joint distributions

A similar property holds for joint distributions. For discrete random variables *X* and *Y*, a function of two variables *g*, and joint probability mass function *f*(*x*, *y*):^{[5]}

- [math]\displaystyle{ \operatorname{E}[g(X, Y)] = \sum_y \sum_x g(x, y) f(x, y) }[/math]

In the absolutely continuous case, with *f*(*x*, *y*) being the joint probability density function,

- [math]\displaystyle{ \operatorname{E}[g(X, Y)] = \int_{-\infty}^\infty \int_{-\infty}^\infty g(x, y) f(x, y) \, \mathrm{d}x \, \mathrm{d}y }[/math]
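The absolutely continuous case can likewise be approximated with a two-dimensional midpoint rule. A minimal sketch under hypothetical choices: *X*, *Y* independent Uniform(0, 1), so *f*(*x*, *y*) = 1 on the unit square, and *g*(*x*, *y*) = *x* + *y* with exact mean E[*X* + *Y*] = 1.

```python
# Continuous joint LOTUS via a two-dimensional midpoint rule on the unit square.
# Hypothetical example: X, Y independent Uniform(0, 1), f(x, y) = 1,
# g(x, y) = x + y, exact value 1.

def lotus_joint_continuous(g, f, n=400):
    """Approximate the double integral of g(x, y) * f(x, y) over [0, 1]^2."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            x, y = (i + 0.5) * h, (j + 0.5) * h
            total += g(x, y) * f(x, y)
    return total * h * h

approx_2d = lotus_joint_continuous(lambda x, y: x + y, lambda x, y: 1.0)
print(approx_2d)  # close to 1.0
```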

## Proof

This law is not a trivial result of definitions as it might at first appear, but rather must be proved.^{[5]}^{[6]}^{[7]}

### Continuous case

For a continuous random variable *X*, let *Y* = *g*(*X*), and suppose that *g* is differentiable and strictly increasing, so that its inverse *g*^{−1} exists. By the formula for the derivative of an inverse function,

- [math]\displaystyle{ \frac{d}{dy}(g^{-1}(y)) = \frac{1}{g^{\prime}(g^{-1}(y))} }[/math]

Because *x* = *g*^{−1}(*y*),

- [math]\displaystyle{ dx = \frac{1}{g^{\prime}(g^{-1}(y))}dy }[/math]

so that, by a change of variables,

- [math]\displaystyle{ \int_{-\infty}^\infty g(x)f_X(x) \, dx = \int_{-\infty}^\infty yf_X(g^{-1}(y))\frac{1}{g^\prime(g^{-1}(y))} \, dy }[/math]

Now, since the cumulative distribution function of *Y* is [math]\displaystyle{ F_Y(y) = P(Y \leq y) = P(g(X) \leq y) }[/math], and since *g* is increasing, applying *g*^{−1} inside the probability gives [math]\displaystyle{ F_Y(y) = P(X \leq g^{-1}(y)) = F_X(g^{-1}(y)) }[/math]. Differentiating both sides with respect to *y* by the chain rule,

- [math]\displaystyle{ f_Y(y) = f_X(g^{-1}(y))\frac{1}{g^\prime(g^{-1}(y))} }[/math]

Combining these expressions, we find

- [math]\displaystyle{ \int_{-\infty}^\infty g(x)f_X(x) \, dx = \int_{-\infty}^\infty yf_Y(y) \, dy }[/math]

By the definition of expected value,

- [math]\displaystyle{ \operatorname E[g(X)] = \int_{-\infty}^\infty g(x)f_X(x) \, dx }[/math]
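The change-of-variables step in this proof can be checked numerically for a concrete monotone map. A minimal sketch under hypothetical choices: *X* ~ Uniform(0, 1) with *f*_{X} = 1 and the increasing function *g*(*x*) = e^{x}, so that *g*^{−1}(*y*) = ln *y*, *g*′(*x*) = e^{x}, and the derived density is *f*_{Y}(*y*) = 1/*y* on [1, e]; both integrals should equal e − 1.

```python
import math

# Numerical check of the change of variables in the proof above.
# Hypothetical choices: X ~ Uniform(0, 1), f_X = 1, g(x) = exp(x).
# Then f_Y(y) = f_X(log y) / g'(log y) = 1 / y on [1, e].

def midpoint(h_func, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of h_func over [lo, hi]."""
    step = (hi - lo) / n
    return sum(h_func(lo + (i + 0.5) * step) for i in range(n)) * step

lhs = midpoint(lambda x: math.exp(x) * 1.0, 0.0, 1.0)   # integral of g(x) f_X(x) dx
rhs = midpoint(lambda y: y * (1.0 / y), 1.0, math.e)    # integral of y f_Y(y) dy
print(lhs, rhs)  # both close to e - 1, about 1.71828
```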

### Discrete case

Let [math]\displaystyle{ Y = g(X) }[/math]. Then begin with the definition of expected value.

- [math]\displaystyle{ \operatorname E[Y] = \sum_y yf_{Y}(y) }[/math]

- [math]\displaystyle{ \operatorname E[g(X)] = \sum_y yP(g(X) = y) }[/math]

- [math]\displaystyle{ \operatorname E[g(X)] = \sum_y y \sum_{x\,:\,g(x) = y} f_X(x) }[/math]

- [math]\displaystyle{ \operatorname E[g(X)] = \sum_x g(x)f_X(x) }[/math]
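The key step above is regrouping the sum by the value *y* = *g*(*x*), which matters precisely when *g* is not injective. A minimal sketch on a hypothetical non-injective example: *X* uniform on {−2, −1, 0, 1, 2} and *g*(*x*) = *x*², so several *x*-values share each *y*.

```python
from collections import defaultdict

# Check the regrouping step: sum over y of y * f_Y(y) equals
# sum over x of g(x) * f_X(x), for a non-injective g.
# Hypothetical example: X uniform on {-2, -1, 0, 1, 2}, g(x) = x**2.
pmf = {x: 0.2 for x in (-2, -1, 0, 1, 2)}
g = lambda x: x**2

# Right-hand side: sum over x of g(x) * f_X(x).
direct = sum(g(x) * p for x, p in pmf.items())

# Left-hand side: build f_Y by pooling the x-values with g(x) = y,
# then sum y * f_Y(y).
f_Y = defaultdict(float)
for x, p in pmf.items():
    f_Y[g(x)] += p
grouped = sum(y * p for y, p in f_Y.items())

print(direct, grouped)  # both 2.0
```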

### From measure theory

A technically complete derivation of the result is available using arguments in measure theory, in which the probability space of a transformed random variable *g*(*X*) is related to that of the original random variable *X*. The steps here involve defining a pushforward measure for the transformed space, and the result is then an example of a change of variables formula.^{[5]}

- [math]\displaystyle{ \int_\Omega g \circ X \, \mathrm{d}P = \int_{\Omega_{X}} g \, \mathrm{d}(X_* P) }[/math]

We say [math]\displaystyle{ X:(\Omega, \Sigma, P)\to (\Omega_{X}, \Sigma_{X}) }[/math] has a density if the pushforward measure [math]\displaystyle{ \mathrm{d}(X_* P) }[/math] is absolutely continuous with respect to the Lebesgue measure [math]\displaystyle{ \mu }[/math]. In that case,

- [math]\displaystyle{ \mathrm{d}(X_*P) = f \, \mathrm{d} \mu, }[/math]

where [math]\displaystyle{ f : {\Omega_X} \to \mathbb{R} }[/math] is the density (see Radon–Nikodym derivative). So the above can be rewritten as the more familiar

- [math]\displaystyle{ \operatorname{E}[g(X)] = \int_{\Omega} g \circ X \, \mathrm{d}P = \int_{\Omega_{X}} g(x) f(x) \, \mathrm{d}x . }[/math]
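The identity can also be read probabilistically: averaging *g*(*X*(ω)) over draws from the underlying space agrees with integrating *g* against the pushforward distribution. A minimal Monte Carlo sketch under hypothetical choices: *X* ~ Exponential(1) generated by inverse transform sampling, *g*(*x*) = *x*², exact value 2.

```python
import math
import random

# Monte Carlo illustration of integrating g∘X over the underlying space:
# the sample average of g(X(omega)) approximates E[g(X)].
# Hypothetical choices: X ~ Exponential(1) via inverse transform, g(x) = x**2,
# exact value E[X^2] = 2.

random.seed(0)
n = 200_000
samples = (-math.log(1 - random.random()) for _ in range(n))  # draws of X(omega)
mc = sum(x**2 for x in samples) / n  # sample-average approximation of E[g(X)]
print(mc)  # close to 2 (Monte Carlo error is O(1/sqrt(n)))
```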

## References

- ↑ Eric Key (1998). "Lecture 6: Random variables". Lecture notes, University of Leeds.
- ↑ Bengt Ringner (2009). "Law of the unconscious statistician". Unpublished note, Centre for Mathematical Sciences, Lund University.
- ↑ Blitzstein, Joseph K.; Hwang, Jessica (2014). *Introduction to Probability* (1st ed.). Chapman and Hall. p. 156.
- ↑ ^{4.0} ^{4.1} DeGroot, Morris; Schervish, Mark (2014). *Probability and Statistics* (4th ed.). Pearson Education Limited. p. 213.
- ↑ ^{5.0} ^{5.1} ^{5.2} Ross, Sheldon M. (2010). *Introduction to Probability Models* (10th ed.). Elsevier.
- ↑ Virtual Laboratories in Probability and Statistics, Sect. 3.1 "Expected Value: Definition and Properties", item "Basic Results: Change of Variables Theorem".
- ↑ Rumbos, Adolfo J. (2008). "Probability lecture notes". http://pages.pomona.edu/~ajr04747/Spring2008/Math151/Math151NotesSpring08.pdf.

Original source: https://en.wikipedia.org/wiki/Law_of_the_unconscious_statistician