Pi-system

Short description: Family of sets closed under intersection

In mathematics, a π-system (or pi-system) on a set [math]\displaystyle{ \Omega }[/math] is a collection [math]\displaystyle{ P }[/math] of certain subsets of [math]\displaystyle{ \Omega, }[/math] such that

  • [math]\displaystyle{ P }[/math] is non-empty.
  • If [math]\displaystyle{ A, B \in P }[/math] then [math]\displaystyle{ A \cap B \in P. }[/math]

That is, [math]\displaystyle{ P }[/math] is a non-empty family of subsets of [math]\displaystyle{ \Omega }[/math] that is closed under non-empty finite intersections.[nb 1] The importance of π-systems arises from the fact that if two probability measures agree on a π-system, then they agree on the 𝜎-algebra generated by that π-system. Moreover, if other properties, such as equality of integrals, hold for the π-system, then they hold for the generated 𝜎-algebra as well. This is the case whenever the collection of subsets for which the property holds is a 𝜆-system. π-systems are also useful for checking independence of random variables.

This is desirable because in practice, π-systems are often simpler to work with than 𝜎-algebras. For example, it may be awkward to work with 𝜎-algebras generated by infinitely many sets [math]\displaystyle{ \sigma(E_1, E_2, \ldots). }[/math] So instead we may examine the union of all 𝜎-algebras generated by finitely many sets [math]\displaystyle{ \bigcup_n \sigma(E_1, \ldots, E_n). }[/math] This forms a π-system that generates the desired 𝜎-algebra. Another example is the collection of all intervals of the real line, along with the empty set, which is a π-system that generates the very important Borel 𝜎-algebra of subsets of the real line.

Definitions

A π-system is a non-empty collection of sets [math]\displaystyle{ P }[/math] that is closed under non-empty finite intersections, which is equivalent to [math]\displaystyle{ P }[/math] containing the intersection of any two of its elements. If every set in this π-system is a subset of [math]\displaystyle{ \Omega }[/math] then it is called a π-system on [math]\displaystyle{ \Omega. }[/math]

For any non-empty family [math]\displaystyle{ \Sigma }[/math] of subsets of [math]\displaystyle{ \Omega, }[/math] there exists a π-system [math]\displaystyle{ \mathcal{I}_{\Sigma}, }[/math] called the π-system generated by [math]\displaystyle{ \Sigma, }[/math] that is the unique smallest π-system on [math]\displaystyle{ \Omega }[/math] containing every element of [math]\displaystyle{ \Sigma. }[/math] It is equal to the intersection of all π-systems containing [math]\displaystyle{ \Sigma, }[/math] and can be explicitly described as the set of all possible non-empty finite intersections of elements of [math]\displaystyle{ \Sigma: }[/math] [math]\displaystyle{ \left\{E_1 \cap \cdots \cap E_n ~:~ 1 \leq n \in \N \text{ and } E_1, \ldots, E_n \in \Sigma\right\}. }[/math]

A non-empty family of sets has the finite intersection property if and only if the π-system it generates does not contain the empty set as an element.
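On a finite family, the explicit description above can be computed directly by intersecting pairs until nothing new appears. The following is a minimal sketch under that finiteness assumption; the function names `generated_pi_system` and `has_finite_intersection_property` are illustrative and do not come from the cited references.

```python
from itertools import combinations


def generated_pi_system(family):
    """Return every non-empty finite intersection of members of `family`.

    `family` is a finite, non-empty iterable of finite sets.
    """
    members = [frozenset(s) for s in family]
    if not members:
        raise ValueError("the generating family must be non-empty")
    result = set(members)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so that `result` can grow inside the loop.
        for a, b in combinations(list(result), 2):
            c = a & b
            if c not in result:
                result.add(c)
                changed = True
    return result


def has_finite_intersection_property(family):
    """True when the generated pi-system does not contain the empty set."""
    return frozenset() not in generated_pi_system(family)


if __name__ == "__main__":
    sigma = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]
    for s in sorted(generated_pi_system(sigma), key=lambda s: (len(s), sorted(s))):
        print(sorted(s))
    print(has_finite_intersection_property(sigma))
```

Closing under pairwise intersections is enough here because any finite intersection can be built up two sets at a time.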

Examples

  • The intervals [math]\displaystyle{ (-\infty, a], }[/math] with [math]\displaystyle{ a }[/math] ranging over the real numbers, form a π-system, and the intervals [math]\displaystyle{ (a, b], }[/math] with [math]\displaystyle{ a }[/math] and [math]\displaystyle{ b }[/math] ranging over the real numbers, form a π-system if the empty set is also included.
  • The topology (collection of open subsets) of any topological space is a π-system.
  • Every filter is a π-system. Every π-system that doesn't contain the empty set is a prefilter (also known as a filter base).
  • For any measurable function [math]\displaystyle{ f : \Omega \to \Reals, }[/math] the set  [math]\displaystyle{ \mathcal{I}_f = \left\{f^{-1}((-\infty, x]) : x \in \Reals\right\} }[/math] defines a π-system, and is called the π-system generated by [math]\displaystyle{ f. }[/math] (Alternatively, [math]\displaystyle{ \left\{f^{-1}((a, b]) : a, b \in \Reals, a \lt b\right\} \cup \{\varnothing\} }[/math] defines a π-system generated by [math]\displaystyle{ f. }[/math])
  • If [math]\displaystyle{ P_1 }[/math] and [math]\displaystyle{ P_2 }[/math] are π-systems for [math]\displaystyle{ \Omega_1 }[/math] and [math]\displaystyle{ \Omega_2, }[/math] respectively, then [math]\displaystyle{ \{A_1 \times A_2 : A_1 \in P_1, A_2 \in P_2\} }[/math] is a π-system for the Cartesian product [math]\displaystyle{ \Omega_1 \times \Omega_2. }[/math]
  • Every 𝜎-algebra is a π-system.

Relationship to 𝜆-systems

A 𝜆-system on [math]\displaystyle{ \Omega }[/math] is a set [math]\displaystyle{ D }[/math] of subsets of [math]\displaystyle{ \Omega, }[/math] satisfying

  • [math]\displaystyle{ \Omega \in D, }[/math]
  • if [math]\displaystyle{ A \in D }[/math] then [math]\displaystyle{ \Omega \setminus A \in D, }[/math]
  • if [math]\displaystyle{ A_1, A_2, A_3, \ldots }[/math] is a sequence of (pairwise) disjoint subsets in [math]\displaystyle{ D }[/math] then [math]\displaystyle{ \textstyle\bigcup\limits_{n=1}^\infty A_n \in D. }[/math]

Whilst every 𝜎-algebra is both a π-system and a 𝜆-system, not every π-system is a 𝜆-system, and not every π-system is a 𝜎-algebra. However, a useful characterization is that any set system which is both a 𝜆-system and a π-system is a 𝜎-algebra. This is used as a step in proving the π-𝜆 theorem.
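These relationships can be checked by brute force on a small finite set. The sketch below assumes a finite [math]\displaystyle{ \Omega }[/math], so that closure under pairwise disjoint unions already gives closure under countable disjoint unions. The predicate names and the example families are illustrative choices, not taken from the cited references.

```python
from itertools import combinations


def powerset(omega):
    """All subsets of the finite set `omega`, as frozensets."""
    items = list(omega)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}


def is_pi_system(sets):
    """Non-empty and closed under pairwise intersection."""
    return bool(sets) and all((a & b) in sets for a in sets for b in sets)


def is_lambda_system(sets, omega):
    """Contains omega, closed under complements and disjoint unions."""
    omega = frozenset(omega)
    if omega not in sets:
        return False
    if any((omega - a) not in sets for a in sets):
        return False
    return all((a | b) in sets for a in sets for b in sets if not (a & b))


def is_sigma_algebra(sets, omega):
    """Contains omega, closed under complements and (finite) unions."""
    omega = frozenset(omega)
    return (omega in sets
            and all((omega - a) in sets for a in sets)
            and all((a | b) in sets for a in sets for b in sets))


OMEGA = frozenset({1, 2, 3, 4})
# Subsets of even cardinality: a lambda-system that is not a pi-system.
EVEN = {s for s in powerset(OMEGA) if len(s) % 2 == 0}
# A nested chain: a pi-system that is not a lambda-system.
CHAIN = {frozenset({1}), frozenset({1, 2}), frozenset({1, 2, 3})}

if __name__ == "__main__":
    for name, fam in [("even", EVEN), ("chain", CHAIN), ("power set", powerset(OMEGA))]:
        print(name, is_pi_system(fam), is_lambda_system(fam, OMEGA),
              is_sigma_algebra(fam, OMEGA))
```

The family of even-cardinality subsets fails to be a π-system because, for instance, [math]\displaystyle{ \{1, 2\} \cap \{2, 3\} = \{2\} }[/math] has odd cardinality.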

The π-𝜆 theorem

Let [math]\displaystyle{ D }[/math] be a 𝜆-system, and let  [math]\displaystyle{ \mathcal{I} \subseteq D }[/math] be a π-system contained in [math]\displaystyle{ D. }[/math] The π-𝜆 theorem[1] states that the 𝜎-algebra [math]\displaystyle{ \sigma(\mathcal{I}) }[/math] generated by [math]\displaystyle{ \mathcal{I} }[/math] is contained in [math]\displaystyle{ D ~:~ }[/math] [math]\displaystyle{ \sigma(\mathcal{I}) \subseteq D. }[/math]
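An equivalent way to read the theorem is that the smallest 𝜆-system containing a π-system coincides with the 𝜎-algebra it generates. The following sketch illustrates this on a four-point set by computing both closures by brute force. It assumes a finite [math]\displaystyle{ \Omega }[/math]; the helper `generate` and the example families are hypothetical, chosen only for illustration.

```python
def generate(family, omega, disjoint_only):
    """Smallest class containing `family` and omega that is closed under
    complements and under unions (of disjoint sets only, if requested)."""
    omega = frozenset(omega)
    result = {frozenset(s) for s in family} | {omega}
    while True:
        new = {omega - a for a in result}
        new |= {a | b for a in result for b in result
                if not disjoint_only or not (a & b)}
        if new <= result:  # fixed point reached
            return result
        result |= new


def generated_lambda_system(family, omega):
    return generate(family, omega, disjoint_only=True)


def generated_sigma_algebra(family, omega):
    return generate(family, omega, disjoint_only=False)


if __name__ == "__main__":
    omega = {1, 2, 3, 4}
    pi_system = [{1, 2}, {2, 3}, {2}]   # closed under intersection
    not_pi = [{1, 2}, {2, 3}]           # missing the intersection {2}
    for fam in (pi_system, not_pi):
        lam = generated_lambda_system(fam, omega)
        sig = generated_sigma_algebra(fam, omega)
        print(len(lam), len(sig), lam == sig)
```

For the family that is not closed under intersection, the generated 𝜆-system is strictly smaller than the generated 𝜎-algebra, which shows that the π-system hypothesis is doing real work.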

The π-𝜆 theorem can be used to prove many elementary measure theoretic results. For instance, it is used in proving the uniqueness claim of the Carathéodory extension theorem for 𝜎-finite measures.[2]

The π-𝜆 theorem is closely related to the monotone class theorem, which provides a similar relationship between monotone classes and algebras, and can be used to derive many of the same results. Since π-systems are simpler classes than algebras, it can be easier to identify the sets that are in them while, on the other hand, checking whether the property under consideration determines a 𝜆-system is often relatively easy. Despite the difference between the two theorems, the π-𝜆 theorem is sometimes referred to as the monotone class theorem.[1]

Example

Let [math]\displaystyle{ \mu_1, \mu_2 : F \to \Reals }[/math] be two measures on a 𝜎-algebra [math]\displaystyle{ F }[/math] of subsets of [math]\displaystyle{ \Omega, }[/math] and suppose that [math]\displaystyle{ F = \sigma(I) }[/math] is generated by a π-system [math]\displaystyle{ I. }[/math] If

  1. [math]\displaystyle{ \mu_1(A) = \mu_2(A) }[/math] for all [math]\displaystyle{ A \in I, }[/math] and
  2. [math]\displaystyle{ \mu_1(\Omega) = \mu_2(\Omega) \lt \infty, }[/math]

then [math]\displaystyle{ \mu_1 = \mu_2. }[/math] This is the uniqueness statement of the Carathéodory extension theorem for finite measures. If this result does not seem very remarkable, consider the fact that it usually is very difficult or even impossible to fully describe every set in the 𝜎-algebra, and so the problem of equating measures would be completely hopeless without such a tool.

Idea of the proof.[2] Define the collection of sets [math]\displaystyle{ D = \left\{ A \in \sigma(I) \colon \mu_1(A) = \mu_2(A) \right\}. }[/math] By the first assumption, [math]\displaystyle{ \mu_1 }[/math] and [math]\displaystyle{ \mu_2 }[/math] agree on [math]\displaystyle{ I }[/math] and thus [math]\displaystyle{ I \subseteq D. }[/math] By the second assumption, [math]\displaystyle{ \Omega \in D, }[/math] and it can further be shown that [math]\displaystyle{ D }[/math] is a 𝜆-system. It follows from the π-𝜆 theorem that [math]\displaystyle{ \sigma(I) \subseteq D \subseteq \sigma(I), }[/math] and so [math]\displaystyle{ D = \sigma(I). }[/math] That is to say, the measures agree on [math]\displaystyle{ \sigma(I). }[/math]
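The requirement that the generating family be a π-system cannot simply be dropped. The toy example below, which is an illustration and not taken from the cited references, gives two probability measures on a four-point set that agree on a generating family and on [math]\displaystyle{ \Omega }[/math] but differ on the intersection of two generators. The helper `measure` and the chosen point masses are assumptions made for the sketch.

```python
from fractions import Fraction


def measure(weights):
    """Return the set function A -> sum of the point masses of A."""
    return lambda A: sum((weights[x] for x in A), Fraction(0))


OMEGA = frozenset({1, 2, 3, 4})
quarter = Fraction(1, 4)
half = Fraction(1, 2)

mu1 = measure({1: quarter, 2: quarter, 3: quarter, 4: quarter})
mu2 = measure({1: half, 2: Fraction(0), 3: half, 4: Fraction(0)})

# This family generates the full power set but is not closed under
# intersection, because {1, 2} & {2, 3} = {2} is missing.
G = [frozenset({1, 2}), frozenset({2, 3})]

if __name__ == "__main__":
    print(all(mu1(A) == mu2(A) for A in G))      # agreement on the generators
    print(mu1(OMEGA) == mu2(OMEGA))              # equal finite total mass
    print(mu1(frozenset({2})), mu2(frozenset({2})))  # values on the intersection
```

Once [math]\displaystyle{ \{2\} }[/math] is added to make the family a π-system, the two measures no longer agree on it, which is consistent with the uniqueness statement.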

π-Systems in probability

π-systems are more commonly used in the study of probability theory than in the general field of measure theory. This is primarily due to probabilistic notions such as independence, though it may also be a consequence of the fact that the π-𝜆 theorem was proven by the probabilist Eugene Dynkin. Standard measure theory texts typically prove the same results via monotone classes, rather than π-systems.

Equality in distribution

The π-𝜆 theorem motivates the common definition of the probability distribution of a random variable [math]\displaystyle{ X : (\Omega, \mathcal F, \operatorname P) \to \Reals }[/math] in terms of its cumulative distribution function. Recall that the cumulative distribution of a random variable is defined as [math]\displaystyle{ F_X(a) = \operatorname{P}[X \leq a], \qquad a \in \Reals, }[/math] whereas the seemingly more general law of the variable is the probability measure [math]\displaystyle{ \mathcal{L}_X(B) = \operatorname{P}\left[X^{-1}(B)\right] \quad \text{ for all } B \in \mathcal{B}(\Reals), }[/math] where [math]\displaystyle{ \mathcal{B}(\Reals) }[/math] is the Borel 𝜎-algebra. The random variables [math]\displaystyle{ X :(\Omega, \mathcal F, \operatorname P) \to \Reals }[/math] and [math]\displaystyle{ Y : (\tilde\Omega,\tilde{\mathcal F}, \tilde{\operatorname P}) \to \Reals }[/math] (on two possibly different probability spaces) are equal in distribution (or law), denoted by [math]\displaystyle{ X \,\stackrel{\mathcal D}{=}\, Y, }[/math] if they have the same cumulative distribution functions; that is, if [math]\displaystyle{ F_X = F_Y. }[/math] The motivation for the definition stems from the observation that if [math]\displaystyle{ F_X = F_Y, }[/math] then that is exactly to say that [math]\displaystyle{ \mathcal{L}_X }[/math] and [math]\displaystyle{ \mathcal{L}_Y }[/math] agree on the π-system [math]\displaystyle{ \{(-\infty, a] : a \in \Reals\} }[/math] which generates [math]\displaystyle{ \mathcal{B}(\Reals), }[/math] and so by the example above: [math]\displaystyle{ \mathcal{L}_X = \mathcal{L}_Y. }[/math]

A similar result holds for the joint distribution of a random vector. For example, suppose [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math] are two random variables defined on the same probability space [math]\displaystyle{ (\Omega, \mathcal{F}, \operatorname{P}), }[/math] with respectively generated π-systems [math]\displaystyle{ \mathcal{I}_X }[/math] and [math]\displaystyle{ \mathcal{I}_Y. }[/math] The joint cumulative distribution function of [math]\displaystyle{ (X, Y) }[/math] is [math]\displaystyle{ F_{X,Y}(a, b) = \operatorname{P}[X \leq a, Y \leq b] = \operatorname{P}\left[X^{-1}((-\infty, a]) \cap Y^{-1}((-\infty, b])\right], \quad \text{ for all } a, b \in \Reals. }[/math]

Here [math]\displaystyle{ A = X^{-1}((-\infty, a]) \in \mathcal{I}_X }[/math] and [math]\displaystyle{ B = Y^{-1}((-\infty, b]) \in \mathcal{I}_Y. }[/math] Because [math]\displaystyle{ \mathcal{I}_{X,Y} = \left\{A \cap B : A \in \mathcal{I}_X, \text{ and } B \in \mathcal{I}_Y\right\} }[/math] is a π-system generated by the random pair [math]\displaystyle{ (X, Y), }[/math] the π-𝜆 theorem is used to show that the joint cumulative distribution function suffices to determine the joint law of [math]\displaystyle{ (X, Y). }[/math] In other words, two random pairs [math]\displaystyle{ (X, Y) }[/math] and [math]\displaystyle{ (W, Z) }[/math] have the same distribution if and only if they have the same joint cumulative distribution function.

In the theory of stochastic processes, two processes [math]\displaystyle{ (X_t)_{t \in T}, (Y_t)_{t \in T} }[/math] are known to be equal in distribution if and only if they agree on all finite-dimensional distributions; that is, for all [math]\displaystyle{ t_1, \ldots, t_n \in T, \, n \in \N, }[/math] [math]\displaystyle{ \left(X_{t_1}, \ldots, X_{t_n}\right) \,\stackrel{\mathcal{D}}{=}\, \left(Y_{t_1}, \ldots, Y_{t_n}\right). }[/math]

The proof of this is another application of the π-𝜆 theorem.[3]

Independent random variables

The theory of π-systems plays an important role in the probabilistic notion of independence. If [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math] are two random variables defined on the same probability space [math]\displaystyle{ (\Omega, \mathcal{F}, \operatorname{P}) }[/math] then the random variables are independent if and only if their π-systems [math]\displaystyle{ \mathcal{I}_X, \mathcal{I}_Y }[/math] satisfy for all [math]\displaystyle{ A \in \mathcal{I}_X }[/math] and [math]\displaystyle{ B \in \mathcal{I}_Y, }[/math] [math]\displaystyle{ \operatorname{P}[A \cap B] ~=~ \operatorname{P}[A] \operatorname{P}[B], }[/math] which is to say that [math]\displaystyle{ \mathcal{I}_X, \mathcal{I}_Y }[/math] are independent. This is in fact a special case of the use of π-systems for determining the distribution of [math]\displaystyle{ (X, Y). }[/math]

Example

Let [math]\displaystyle{ Z = \left(Z_1, Z_2\right), }[/math] where [math]\displaystyle{ Z_1, Z_2 \sim \mathcal{N}(0, 1) }[/math] are iid standard normal random variables. Define the radius and argument variables [math]\displaystyle{ R = \sqrt{Z_1^2 + Z_2^2}, \qquad \Theta = \arg\left(Z_1 + i Z_2\right) \in [0, 2\pi), }[/math] so that [math]\displaystyle{ \tan \Theta = Z_2 / Z_1 }[/math] whenever [math]\displaystyle{ Z_1 \neq 0. }[/math]

Then [math]\displaystyle{ R }[/math] and [math]\displaystyle{ \Theta }[/math] are independent random variables.

To prove this, it is sufficient to show that the π-systems [math]\displaystyle{ \mathcal{I}_R, \mathcal{I}_\Theta }[/math] are independent: that is, for all [math]\displaystyle{ \rho \in [0, \infty) }[/math] and [math]\displaystyle{ \theta \in [0, 2 \pi], }[/math] [math]\displaystyle{ \operatorname{P}[R \leq \rho, \Theta \leq \theta] = \operatorname{P}[R \leq \rho] \operatorname{P}[\Theta \leq \theta]. }[/math]

Confirming this is an exercise in changing variables to polar coordinates. Fix [math]\displaystyle{ \rho \in [0, \infty) }[/math] and [math]\displaystyle{ \theta \in [0, 2 \pi]. }[/math] The probability can then be expressed as an integral of the probability density function of [math]\displaystyle{ Z: }[/math] [math]\displaystyle{ \begin{align} \operatorname P [R \leq \rho, \Theta \leq \theta] &= \int_{R \leq \rho, \, \Theta \leq \theta} \frac{1}{2\pi}\exp\left({-\frac12(z_1^2 + z_2^2)}\right) dz_1 \, dz_2 \\[5pt] & = \int_0^{\theta} \int_0^\rho \frac{1}{2 \pi}e^{-\frac{r^2}{2}} \; r \, dr \, d\tilde\theta \\[5pt] & = \left(\int_0^\theta \frac{1}{2 \pi} \, d\tilde \theta\right) \; \left(\int_0^\rho e^{-\frac{r^2}{2}} \; r \, dr\right) \\[5pt] & = \operatorname P[\Theta \leq \theta]\operatorname P[R \leq \rho]. \end{align} }[/math]
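The factorization can also be checked numerically. The following Monte Carlo sketch assumes that [math]\displaystyle{ \Theta }[/math] is taken to be the polar angle of [math]\displaystyle{ (Z_1, Z_2) }[/math] in [math]\displaystyle{ [0, 2\pi), }[/math] as the integration limits above indicate, and compares the empirical joint probability with the product of the empirical marginals. The function names, sample size and seed are arbitrary choices made for illustration.

```python
import math
import random


def polar(z1, z2):
    """Radius and polar angle in [0, 2*pi) of the point (z1, z2)."""
    r = math.hypot(z1, z2)
    theta = math.atan2(z2, z1) % (2 * math.pi)
    return r, theta


def estimate(rho, theta, n=200_000, seed=0):
    """Empirical P[R<=rho, Theta<=theta], P[R<=rho] and P[Theta<=theta]."""
    rng = random.Random(seed)
    joint = hit_r = hit_t = 0
    for _ in range(n):
        r, t = polar(rng.gauss(0, 1), rng.gauss(0, 1))
        in_r = r <= rho
        in_t = t <= theta
        hit_r += in_r
        hit_t += in_t
        joint += in_r and in_t
    return joint / n, hit_r / n, hit_t / n


if __name__ == "__main__":
    rho, theta = 1.0, math.pi / 2
    joint, p_r, p_t = estimate(rho, theta)
    print("joint estimate:      ", joint)
    print("product of marginals:", p_r * p_t)
    # Closed forms read off from the two one-dimensional integrals above.
    print("exact product:       ", (1 - math.exp(-rho**2 / 2)) * theta / (2 * math.pi))
```

A simulation of this kind only tests the identity at the chosen values of [math]\displaystyle{ \rho }[/math] and [math]\displaystyle{ \theta, }[/math] up to sampling error; it is a sanity check, not a substitute for the change of variables.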

Notes

  1. The nullary (0-ary) intersection of subsets of [math]\displaystyle{ \Omega }[/math] is by convention equal to [math]\displaystyle{ \Omega, }[/math] which is not required to be an element of a π-system.

Citations

  1. Kallenberg, Foundations of Modern Probability, p. 2
  2. Durrett, Probability: Theory and Examples, p. 404
  3. Kallenberg, Foundations of Modern Probability, p. 48
