Law of total expectation


The proposition in probability theory known as the law of total expectation,[1] the law of iterated expectations[2] (LIE), Adam's law,[3] the tower rule,[4] and the smoothing theorem,[5] among other names, states that if [math]\displaystyle{ X }[/math] is a random variable whose expected value [math]\displaystyle{ \operatorname{E}(X) }[/math] is defined, and [math]\displaystyle{ Y }[/math] is any random variable on the same probability space, then

[math]\displaystyle{ \operatorname{E} (X) = \operatorname{E} ( \operatorname{E} ( X \mid Y)), }[/math]

i.e., the expected value of the conditional expected value of [math]\displaystyle{ X }[/math] given [math]\displaystyle{ Y }[/math] is the same as the expected value of [math]\displaystyle{ X }[/math].

One special case states that if [math]\displaystyle{ {\left\{A_i\right\}}_i }[/math] is a finite or countable partition of the sample space, then

[math]\displaystyle{ \operatorname{E} (X) = \sum_i{\operatorname{E}(X \mid A_i) \operatorname{P}(A_i)}. }[/math]

Note: The conditional expected value E(X | Z) is a random variable whose value depends on the value of Z. The conditional expected value of X given the event Z = z, on the other hand, is a function of z: if we write E(X | Z = z) = g(z), then the random variable E(X | Z) is g(Z). Similar comments apply to the conditional covariance.
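
To make this concrete, the following Python sketch uses a small, purely illustrative joint distribution of X and Z (the probabilities are invented, not taken from any source), computes g(z) = E(X | Z = z) for each value z, and checks that E(g(Z)) coincides with E(X).

# Minimal sketch: E(X | Z) as the random variable g(Z).
# The joint probabilities below are invented purely for illustration.
joint = {(0, 0): 0.10, (1, 0): 0.20, (2, 0): 0.10,   # joint[(x, z)] = P(X = x, Z = z)
         (0, 1): 0.15, (1, 1): 0.05, (2, 1): 0.40}

xs = sorted({x for x, _ in joint})
zs = sorted({z for _, z in joint})

p_z = {z: sum(joint[(x, z)] for x in xs) for z in zs}               # P(Z = z)
g = {z: sum(x * joint[(x, z)] for x in xs) / p_z[z] for z in zs}    # g(z) = E(X | Z = z)

e_x = sum(x * p for (x, _), p in joint.items())                     # E(X) computed directly
e_gz = sum(g[z] * p_z[z] for z in zs)                               # E(g(Z)) = E(E(X | Z))

print(g)            # the function z -> E(X | Z = z)
print(e_x, e_gz)    # both equal 1.25 (up to floating-point rounding)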

Example

Suppose that only two factories supply light bulbs to the market. Factory [math]\displaystyle{ X }[/math]'s bulbs work for an average of 5000 hours, whereas factory [math]\displaystyle{ Y }[/math]'s bulbs work for an average of 4000 hours. It is known that factory [math]\displaystyle{ X }[/math] supplies 60% of the total bulbs available. What is the expected length of time that a purchased bulb will work?

Applying the law of total expectation, we have:

[math]\displaystyle{ \begin{align} \operatorname{E} (L) &= \operatorname{E}(L \mid X) \operatorname{P}(X)+\operatorname{E}(L \mid Y) \operatorname{P}(Y) \\[3pt] &= 5000(0.6)+4000(0.4)\\[2pt] &=4600 \end{align} }[/math]

where

  • [math]\displaystyle{ \operatorname{E} (L) }[/math] is the expected life of the bulb;
  • [math]\displaystyle{ \operatorname{P}(X)={6 \over 10} }[/math] is the probability that the purchased bulb was manufactured by factory [math]\displaystyle{ X }[/math];
  • [math]\displaystyle{ \operatorname{P}(Y)={4 \over 10} }[/math] is the probability that the purchased bulb was manufactured by factory [math]\displaystyle{ Y }[/math];
  • [math]\displaystyle{ \operatorname{E}(L \mid X)=5000 }[/math] is the expected lifetime of a bulb manufactured by [math]\displaystyle{ X }[/math];
  • [math]\displaystyle{ \operatorname{E}(L \mid Y)=4000 }[/math] is the expected lifetime of a bulb manufactured by [math]\displaystyle{ Y }[/math].

Thus each purchased light bulb has an expected lifetime of 4600 hours.
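
The arithmetic above is easy to verify programmatically. The short Python sketch below recomputes the weighted average and, under the additional assumption (made only for the simulation, not stated in the problem) that each factory's lifetimes are exponentially distributed with the given means, checks the result by Monte Carlo.

import random

# Direct application of the law of total expectation:
# E(L) = E(L | X) P(X) + E(L | Y) P(Y)
print(5000 * 0.6 + 4000 * 0.4)                  # 4600.0

# Monte Carlo check. The exponential lifetime distributions are an assumption
# made only for this simulation; the law itself uses nothing but the means.
random.seed(0)
n = 1_000_000
total = 0.0
for _ in range(n):
    if random.random() < 0.6:                   # bulb made by factory X
        total += random.expovariate(1 / 5000)   # mean 5000 hours
    else:                                       # bulb made by factory Y
        total += random.expovariate(1 / 4000)   # mean 4000 hours
print(total / n)                                # close to 4600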

Proof in the finite and countable cases

Let the random variables [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math], defined on the same probability space, assume a finite or countably infinite set of finite values. Assume that [math]\displaystyle{ \operatorname{E}[X] }[/math] is defined, i.e. [math]\displaystyle{ \min (\operatorname{E}[X_+], \operatorname{E}[X_-]) \lt \infty }[/math]. If [math]\displaystyle{ \{A_i\} }[/math] is a finite or countable partition of the sample space [math]\displaystyle{ \Omega }[/math], then

[math]\displaystyle{ \operatorname{E} (X) = \sum_i{\operatorname{E}(X \mid A_i) \operatorname{P}(A_i)}. }[/math]

Proof.

[math]\displaystyle{ \begin{align} \operatorname{E} \left( \operatorname{E} (X \mid Y) \right) &= \operatorname{E} \Bigg[ \sum_x x \cdot \operatorname{P}(X=x \mid Y) \Bigg] \\[6pt] &=\sum_y \Bigg[ \sum_x x \cdot \operatorname{P}(X=x \mid Y=y) \Bigg] \cdot \operatorname{P}(Y=y) \\[6pt] &=\sum_y \sum_x x \cdot \operatorname{P}(X=x, Y=y). \end{align} }[/math]

If the series is finite, then we can switch the summations around, and the previous expression will become

[math]\displaystyle{ \begin{align} \sum_x \sum_y x \cdot \operatorname{P}(X=x, Y=y)&=\sum_x x\sum_y \operatorname{P}(X=x, Y=y)\\[6pt] &=\sum_x x \cdot \operatorname{P}(X=x)\\[6pt] &=\operatorname{E}(X). \end{align} }[/math]

If, on the other hand, the series is infinite, then its convergence cannot be conditional, due to the assumption that [math]\displaystyle{ \min (\operatorname{E}[X_+], \operatorname{E}[X_-] ) \lt \infty. }[/math] The series converges absolutely if both [math]\displaystyle{ \operatorname{E}[X_+] }[/math] and [math]\displaystyle{ \operatorname{E}[X_-] }[/math] are finite, and diverges to infinity when either [math]\displaystyle{ \operatorname{E}[X_+] }[/math] or [math]\displaystyle{ \operatorname{E}[X_-] }[/math] is infinite. In both scenarios, the above summations may be exchanged without affecting the sum.

Proof in the general case

Let [math]\displaystyle{ (\Omega,\mathcal{F},\operatorname{P}) }[/math] be a probability space on which two sub σ-algebras [math]\displaystyle{ \mathcal{G}_1 \subseteq \mathcal{G}_2 \subseteq \mathcal{F} }[/math] are defined. For a random variable [math]\displaystyle{ X }[/math] on such a space, the smoothing law states that if [math]\displaystyle{ \operatorname{E}[X] }[/math] is defined, i.e. [math]\displaystyle{ \min(\operatorname{E}[X_+], \operatorname{E}[X_-])\lt \infty }[/math], then

[math]\displaystyle{ \operatorname{E}[ \operatorname{E}[X \mid \mathcal{G}_2] \mid \mathcal{G}_1] = \operatorname{E}[X \mid \mathcal{G}_1]\quad\text{(a.s.)}. }[/math]

Proof. Since a conditional expectation is a Radon–Nikodym derivative, verifying the following two properties establishes the smoothing law:

  • [math]\displaystyle{ \operatorname{E}[ \operatorname{E}[X \mid \mathcal{G}_2] \mid \mathcal{G}_1] \mbox{ is } \mathcal{G}_1 }[/math]-measurable
  • [math]\displaystyle{ \int_{G_1} \operatorname{E}[ \operatorname{E}[X \mid \mathcal{G}_2] \mid \mathcal{G}_1] d\operatorname{P} = \int_{G_1} X d\operatorname{P}, }[/math] for all [math]\displaystyle{ G_1 \in \mathcal{G}_1. }[/math]

The first of these properties holds by definition of the conditional expectation. To prove the second one,

[math]\displaystyle{ \begin{align} \min\left(\int_{G_1}X_+\, d\operatorname{P}, \int_{G_1}X_-\, d\operatorname{P}\right) &\leq \min\left(\int_\Omega X_+\, d\operatorname{P}, \int_\Omega X_-\, d\operatorname{P}\right)\\[4pt] &=\min(\operatorname{E}[X_+], \operatorname{E}[X_-]) \lt \infty, \end{align} }[/math]

so the integral [math]\displaystyle{ \textstyle \int_{G_1}X\, d\operatorname{P} }[/math] is defined (i.e., it is not of the form [math]\displaystyle{ \infty - \infty }[/math]).

The second property thus holds since [math]\displaystyle{ G_1 \in \mathcal{G}_1 \subseteq \mathcal{G}_2 }[/math] implies

[math]\displaystyle{ \int_{G_1} \operatorname{E}[ \operatorname{E}[X \mid \mathcal{G}_2] \mid \mathcal{G}_1] d\operatorname{P} = \int_{G_1} \operatorname{E}[X \mid \mathcal{G}_2] d\operatorname{P} = \int_{G_1} X d\operatorname{P}. }[/math]

Corollary. In the special case when [math]\displaystyle{ \mathcal{G}_1 = \{\emptyset,\Omega \} }[/math] and [math]\displaystyle{ \mathcal{G}_2 = \sigma(Y) }[/math], the smoothing law reduces to

[math]\displaystyle{ \operatorname{E}[ \operatorname{E}[X \mid Y]] = \operatorname{E}[X]. }[/math]
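
As a hedged numerical illustration of the smoothing law (not part of the proof), the nested σ-algebras [math]\displaystyle{ \mathcal{G}_1 \subseteq \mathcal{G}_2 }[/math] can be represented by a coarse partition refined by a finer one. The Python sketch below, using a handful of invented samples with the empirical (uniform) measure, computes E[X | G2] as the average of X over each fine cell, then averages those values within each coarse cell, and checks that the result agrees with E[X | G1] computed directly.

# Minimal sketch of the smoothing law E[E[X | G2] | G1] = E[X | G1]
# using the empirical (uniform) measure on a handful of invented samples.
# Each sample carries a coarse label (generating G1) and a fine label
# (generating G2); every fine cell sits inside exactly one coarse cell.
samples = [
    # (x, coarse_label, fine_label)
    (1.0, "a", "a1"), (3.0, "a", "a1"), (2.0, "a", "a2"),
    (6.0, "b", "b1"), (4.0, "b", "b1"), (5.0, "b", "b2"), (7.0, "b", "b2"),
]

def mean(values):
    return sum(values) / len(values)

# E[X | G2]: within each fine cell, X is replaced by its cell average.
fine_cells = {}
for x, _, f in samples:
    fine_cells.setdefault(f, []).append(x)
fine_mean = {f: mean(v) for f, v in fine_cells.items()}

# E[E[X | G2] | G1]: average the fine-cell values within each coarse cell.
# E[X | G1]: average X directly within each coarse cell.
coarse_inner, coarse_direct = {}, {}
for x, c, f in samples:
    coarse_inner.setdefault(c, []).append(fine_mean[f])
    coarse_direct.setdefault(c, []).append(x)

for c in coarse_direct:
    print(c, mean(coarse_inner[c]), mean(coarse_direct[c]))   # the two agree on each coarse cell

Intuitively, conditioning on the finer information first and then averaging it out over the coarser information gives the same result as conditioning on the coarser information from the start.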

Alternative proof for [math]\displaystyle{ \operatorname{E}[ \operatorname{E}[X \mid Y]] = \operatorname{E}[X]. }[/math]

This is a simple consequence of the measure-theoretic definition of conditional expectation. By definition, [math]\displaystyle{ \operatorname{E}[X \mid Y] := \operatorname{E}[X \mid \sigma(Y)] }[/math] is a [math]\displaystyle{ \sigma(Y) }[/math]-measurable random variable that satisfies

[math]\displaystyle{ \int_{A}\operatorname{E}[X \mid Y] d\operatorname{P} = \int_{A} X d\operatorname{P}, }[/math]

for every measurable set [math]\displaystyle{ A \in \sigma(Y) }[/math]. Taking [math]\displaystyle{ A = \Omega }[/math] proves the claim.

Proof of partition formula

[math]\displaystyle{ \begin{align} \sum\limits_i\operatorname{E}(X\mid A_i)\operatorname{P}(A_i) &=\sum\limits_i\int\limits_\Omega X(\omega)\operatorname{P}(d\omega\mid A_i)\cdot\operatorname{P}(A_i)\\ &=\sum\limits_i\int\limits_\Omega X(\omega)\operatorname{P}(d\omega\cap A_i)\\ &=\sum\limits_i\int\limits_\Omega X(\omega)I_{A_i}(\omega)\operatorname{P}(d\omega)\\ &=\sum\limits_i\operatorname{E}(XI_{A_i}), \end{align} }[/math]

where [math]\displaystyle{ I_{A_i} }[/math] is the indicator function of the set [math]\displaystyle{ A_i }[/math].

If the partition [math]\displaystyle{ {\{A_i\}}_{i=0}^n }[/math] is finite, then, by linearity, the previous expression becomes

[math]\displaystyle{ \operatorname{E}\left(\sum\limits_{i=0}^n XI_{A_i}\right)=\operatorname{E}(X), }[/math]

and we are done.

If, however, the partition [math]\displaystyle{ {\{A_i\}}_{i=0}^\infty }[/math] is infinite, then we use the dominated convergence theorem to show that

[math]\displaystyle{ \operatorname{E}\left(\sum\limits_{i=0}^n XI_{A_i}\right)\to\operatorname{E}(X). }[/math]

Indeed, for every [math]\displaystyle{ n\geq 0 }[/math],

[math]\displaystyle{ \left|\sum_{i=0}^n XI_{A_i}\right|\leq |X|I_{\mathop{\bigcup}\limits_{i=0}^n A_i}\leq |X|. }[/math]

Since every element of the set [math]\displaystyle{ \Omega }[/math] falls into exactly one cell [math]\displaystyle{ A_i }[/math] of the partition, it is straightforward to verify that the sequence [math]\displaystyle{ {\left\{\sum_{i=0}^n XI_{A_i}\right\}}_{n=0}^\infty }[/math] converges pointwise to [math]\displaystyle{ X }[/math]. By the initial assumption, [math]\displaystyle{ \operatorname{E}|X|\lt \infty }[/math]. Applying the dominated convergence theorem yields the desired result.
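
For a concrete check of the partition formula (a minimal sketch; the distribution of X and the choice of partition are arbitrary and chosen only for illustration), the following Python snippet estimates [math]\displaystyle{ \textstyle\sum_i \operatorname{E}(X \mid A_i)\operatorname{P}(A_i) }[/math], which equals [math]\displaystyle{ \textstyle\sum_i \operatorname{E}(XI_{A_i}) }[/math], and compares it with a direct estimate of [math]\displaystyle{ \operatorname{E}(X) }[/math].

import random

random.seed(1)
n = 200_000
xs = [random.gauss(0.0, 1.0) for _ in range(n)]    # illustrative choice of X

# A finite partition of the sample space by the value of X:
# A_0 = {X < -1},  A_1 = {-1 <= X < 1},  A_2 = {X >= 1}.
def cell(x):
    return 0 if x < -1 else (1 if x < 1 else 2)

cells = {0: [], 1: [], 2: []}
for x in xs:
    cells[cell(x)].append(x)

# sum_i E(X | A_i) P(A_i), estimated from the sample; each term also
# estimates E(X I_{A_i}), since E(X I_{A_i}) = E(X | A_i) P(A_i).
lhs = sum((sum(c) / len(c)) * (len(c) / n) for c in cells.values())
rhs = sum(xs) / n                                   # E(X), estimated directly
print(lhs, rhs)                                     # the two estimates coincide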

See also

References