# Law of total probability

In probability theory, the law (or formula) of total probability is a fundamental rule relating marginal probabilities to conditional probabilities. It expresses the total probability of an outcome which can be realized via several distinct events, hence the name.

## Statement

In its discrete form, the law of total probability states[1] that if $\displaystyle{ \left\{{B_n : n = 1, 2, 3, \ldots}\right\} }$ is a finite or countably infinite partition of a sample space (that is, a set of pairwise disjoint events whose union is the entire sample space) and each event $\displaystyle{ B_n }$ is measurable, then for any event $\displaystyle{ A }$ of the same sample space:

$\displaystyle{ P(A)=\sum_n P(A\cap B_n) }$

or, alternatively,[1]

$\displaystyle{ P(A)=\sum_n P(A\mid B_n)P(B_n), }$

where any term with $\displaystyle{ P(B_n) = 0 }$ is simply omitted from the summation: $\displaystyle{ P(A\mid B_n) }$ is undefined in that case, but the omission is harmless because $\displaystyle{ P(A\cap B_n) \le P(B_n) = 0 }$.

The summation can be interpreted as a weighted average, and consequently the marginal probability, $\displaystyle{ P(A) }$, is sometimes called "average probability";[2] "overall probability" is sometimes used in less formal writings.[3]
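The two forms of the discrete statement can be checked by exhaustive enumeration on a small sample space. The following is a minimal sketch under assumed inputs: two fair six-sided dice, the event $\displaystyle{ A }$ that the faces sum to 8, and the partition given by the value of the first die. The helper names `P` and `P_given` are illustrative and do not come from the cited sources.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: ordered rolls of two fair dice, all 36 equally likely.
omega = set(product(range(1, 7), repeat=2))

def P(event):
    """Probability of an event (a subset of omega) under the uniform measure."""
    return Fraction(len(event), len(omega))

def P_given(event, given):
    """Conditional probability P(event | given); requires P(given) > 0."""
    return P(event & given) / P(given)

# A: the two faces sum to 8.
A = {w for w in omega if sum(w) == 8}

# Partition B_1, ..., B_6: the first die shows n.
partition = [{w for w in omega if w[0] == n} for n in range(1, 7)]

p_direct = P(A)                                            # P(A) computed directly
p_joint = sum(P(A & B) for B in partition)                 # sum of P(A ∩ B_n)
p_weighted = sum(P_given(A, B) * P(B) for B in partition)  # sum of P(A | B_n) P(B_n)

print(p_direct, p_joint, p_weighted)  # all three equal 5/36
```

Here every $\displaystyle{ P(B_n) = 1/6 }$, so the weighted form is literally the average of the six conditional probabilities.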

The law of total probability can also be stated for conditional probabilities:

$\displaystyle{ P( {A|C} ) = \frac{{P( {A,C} )}}{{P( C )}} = \frac{{\sum\limits_n {P( {A,{B_n},C} )} }}{{P( C )}} = \frac{{\sum\limits_n P ( {A\mid {B_n},C} )P( {{B_n}\mid C} )P( C )}}{{P( C )}} = \sum\limits_n P ( {A\mid {B_n},C} )P( {{B_n}\mid C} ) }$

Taking the $\displaystyle{ B_n }$ as above, and assuming in addition that $\displaystyle{ C }$ is an event independent of each $\displaystyle{ B_n }$, so that $\displaystyle{ P(B_n \mid C) = P(B_n) }$, the formula simplifies to:

$\displaystyle{ P(A \mid C) = \sum_n P(A \mid C,B_n) P(B_n) }$
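Both conditional forms can be verified the same way. The sketch below is illustrative only and assumes two fair dice, the event $\displaystyle{ A }$ that the faces sum to at least 9, the conditioning event $\displaystyle{ C }$ that the second die is even, and the partition by the value of the first die (which is independent of $\displaystyle{ C }$). The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: ordered rolls of two fair dice, all 36 equally likely.
omega = set(product(range(1, 7), repeat=2))

def P(event):
    """Probability of an event (a subset of omega) under the uniform measure."""
    return Fraction(len(event), len(omega))

def P_given(event, given):
    """Conditional probability P(event | given); requires P(given) > 0."""
    return P(event & given) / P(given)

A = {w for w in omega if sum(w) >= 9}     # faces sum to at least 9
C = {w for w in omega if w[1] % 2 == 0}   # second die is even
partition = [{w for w in omega if w[0] == n} for n in range(1, 7)]

lhs = P_given(A, C)
# General form: sum of P(A | B_n, C) P(B_n | C).
general = sum(P_given(A, B & C) * P_given(B, C) for B in partition)
# Simplified form, valid here because C is independent of every B_n.
simplified = sum(P_given(A, B & C) * P(B) for B in partition)

print(lhs, general, simplified)  # all three equal 1/3
```

If $\displaystyle{ C }$ were chosen to depend on the first die, the general form would still agree with $\displaystyle{ P(A \mid C) }$, but the simplified form would in general not.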

## Continuous case

The law of total probability extends to the case of conditioning on events generated by continuous random variables. Let $\displaystyle{ (\Omega, \mathcal{F}, P) }$ be a probability space. Suppose $\displaystyle{ X }$ is a random variable with distribution function $\displaystyle{ F_X }$, and $\displaystyle{ A }$ an event on $\displaystyle{ (\Omega, \mathcal{F}, P) }$. Then the law of total probability states

$\displaystyle{ P(A) = \int_{-\infty}^\infty P(A |X = x) d F_X(x). }$

If $\displaystyle{ X }$ admits a density function $\displaystyle{ f_X }$, then the result is

$\displaystyle{ P(A) = \int_{-\infty}^\infty P(A |X = x) f_X(x) dx. }$

Moreover, in the specific case $\displaystyle{ A = \{Y \in B \} }$, where $\displaystyle{ Y }$ is a random variable and $\displaystyle{ B }$ is a Borel set, this yields

$\displaystyle{ P(Y \in B) = \int_{-\infty}^\infty P(Y \in B |X = x) f_X(x) dx. }$
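The density form of the continuous case can be illustrated numerically. The following sketch rests on assumptions not found in the text above: $\displaystyle{ X }$ is exponential with rate 1, and, given $\displaystyle{ X = x }$, $\displaystyle{ Y }$ is exponential with rate $\displaystyle{ x }$, so that $\displaystyle{ P(Y \gt 1 \mid X = x) = e^{-x} }$. The exact answer is then $\displaystyle{ \int_0^\infty e^{-x} e^{-x}\,dx = 1/2 }$. The function names are hypothetical, and the midpoint rule is just one convenient quadrature choice.

```python
import math

def density_x(x):
    """Assumed density of X: exponential with rate 1 on x >= 0."""
    return math.exp(-x)

def prob_a_given_x(x):
    """Assumed P(Y > 1 | X = x) when Y given X = x is exponential with rate x."""
    return math.exp(-x)

def total_probability(cond_prob, density, lower, upper, steps=100_000):
    """Midpoint-rule approximation of the integral of cond_prob(x) * density(x)."""
    h = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * h  # midpoint of the i-th subinterval
        total += cond_prob(x) * density(x)
    return total * h

# The density vanishes for x < 0, and the tail beyond x = 50 is negligible.
p_a = total_probability(prob_a_given_x, density_x, 0.0, 50.0)
print(p_a)  # close to the exact value 1/2
```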

## Example

Suppose that two factories supply light bulbs to the market. Factory X's bulbs work for over 5000 hours in 99% of cases, whereas factory Y's bulbs work for over 5000 hours in 95% of cases. It is known that factory X supplies 60% of the total bulbs available and Y supplies 40% of the total bulbs available. What is the chance that a purchased bulb will work for longer than 5000 hours?

Applying the law of total probability, we have:

$\displaystyle{ \begin{aligned} P(A) & = P(A\mid B_X) \cdot P(B_X) + P(A\mid B_Y) \cdot P(B_Y) \\[4pt] & = {99 \over 100} \cdot {6 \over 10} + {95 \over 100} \cdot {4 \over 10} = {{594 + 380} \over 1000} = {974 \over 1000} \end{aligned} }$

where

• $\displaystyle{ P(B_X)={6 \over 10} }$ is the probability that the purchased bulb was manufactured by factory X;
• $\displaystyle{ P(B_Y)={4 \over 10} }$ is the probability that the purchased bulb was manufactured by factory Y;
• $\displaystyle{ P(A\mid B_X)={99 \over 100} }$ is the probability that a bulb manufactured by X will work for over 5000 hours;
• $\displaystyle{ P(A\mid B_Y)={95 \over 100} }$ is the probability that a bulb manufactured by Y will work for over 5000 hours.

Thus each purchased light bulb has a 97.4% chance of working for more than 5000 hours.
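The arithmetic of this example can be reproduced with exact fractions. This is a minimal sketch; the dictionary names `share` and `long_life` are illustrative choices, keyed by factory.

```python
from fractions import Fraction

# P(B_X) and P(B_Y): market share of each factory.
share = {"X": Fraction(6, 10), "Y": Fraction(4, 10)}

# P(A | B_X) and P(A | B_Y): chance that a bulb from each factory lasts over 5000 hours.
long_life = {"X": Fraction(99, 100), "Y": Fraction(95, 100)}

# The factories must form a partition: their shares sum to 1.
assert sum(share.values()) == 1

# Law of total probability: weight each conditional probability by its share.
p_long_life = sum(long_life[f] * share[f] for f in share)

print(p_long_life, float(p_long_life))  # 487/500 0.974
```

Using `Fraction` avoids floating-point rounding, so the result matches $\displaystyle{ 974/1000 }$ exactly (printed in lowest terms).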

## Other names

The term law of total probability is sometimes taken to mean the law of alternatives, which is a special case of the law of total probability applying to discrete random variables.[citation needed] One author uses the terminology of the "Rule of Average Conditional Probabilities",[4] while another refers to it as the "continuous law of alternatives" in the continuous case.[5] This result is given by Grimmett and Welsh[6] as the partition theorem, a name that they also give to the related law of total expectation.

## Notes

1. Zwillinger, D.; Kokoska, S. (2000). CRC Standard Probability and Statistics Tables and Formulae. CRC Press. p. 31. ISBN 1-58488-059-7.
2. Paul E. Pfeiffer (1978). Concepts of probability theory. Courier Dover Publications. pp. 47–48. ISBN 978-0-486-63677-1.
3. Deborah Rumsey (2006). Probability for dummies. For Dummies. p. 58. ISBN 978-0-471-75141-0.
4. Jim Pitman (1993). Probability. Springer. p. 41. ISBN 0-387-97974-3.
5. Kenneth Baclawski (2008). Introduction to probability with R. CRC Press. p. 179. ISBN 978-1-4200-6521-3.
6. Geoffrey Grimmett; Dominic Welsh (1986). Probability: An Introduction. Oxford Science Publications. Theorem 1B.
