Continuous mapping theorem
In probability theory, the continuous mapping theorem states that continuous functions preserve limits even if their arguments are sequences of random variables. A continuous function, in Heine's definition, is one that maps convergent sequences into convergent sequences: if xn → x then g(xn) → g(x). The continuous mapping theorem states that this remains true if the deterministic sequence {xn} is replaced with a sequence of random variables {Xn}, and the standard notion of convergence of real numbers "→" is replaced with one of the modes of convergence of random variables.
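As an informal numerical illustration (a minimal Python/NumPy sketch, not part of the original article; the exponential data, the mean μ = 2, and the choice g = exp are arbitrary assumptions made for the demonstration): by the law of large numbers the sample mean converges to the true mean, so a continuous function of the sample mean should settle near the function of the true mean.

```python
import numpy as np

rng = np.random.default_rng(0)
mu = 2.0          # true mean of the underlying distribution (illustrative choice)
g = np.exp        # any continuous function works here

for n in (10, 1_000, 100_000):
    x_bar = rng.exponential(scale=mu, size=n).mean()  # X_n = sample mean -> mu (LLN)
    print(n, abs(g(x_bar) - g(mu)))                   # g(X_n) -> g(mu)
```

The printed error is random but typically shrinks as n grows, which is exactly what the theorem predicts for a continuous g.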
This theorem was first proved by Henry Mann and Abraham Wald in 1943,[1] and it is therefore sometimes called the Mann–Wald theorem.[2] Meanwhile, Denis Sargan refers to it as the general transformation theorem.[3]
Statement
Let {Xn}, X be random elements with values in a metric space S. Suppose a function g: S→S′ (where S′ is another metric space) has a set of discontinuity points Dg such that Pr[X ∈ Dg] = 0. Then[4][5]
- [math]\displaystyle{ \begin{align} X_n \ \xrightarrow{\text{d}}\ X \quad & \Rightarrow\quad g(X_n)\ \xrightarrow{\text{d}}\ g(X); \\[6pt] X_n \ \xrightarrow{\text{p}}\ X \quad & \Rightarrow\quad g(X_n)\ \xrightarrow{\text{p}}\ g(X); \\[6pt] X_n \ \xrightarrow{\text{a.s.}}\ X \quad & \Rightarrow\quad g(X_n)\ \xrightarrow{\text{a.s.}}\ g(X). \end{align} }[/math]
where the labels "d", "p", and "a.s." over the arrows denote convergence in distribution, convergence in probability, and almost sure convergence, respectively.
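A classic instance of the first implication (sketched below under assumed, illustrative choices: Uniform(0,1) data and the continuous map g(x) = x²): by the central limit theorem, Zn = √n (X̄n − μ)/σ converges in distribution to N(0,1), so the theorem gives Zn² → χ²₁ in distribution. The Monte Carlo check compares empirical quantiles of Zn² against squared draws from the limit law.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 1_000, 10_000

# Z_n = sqrt(n) * (sample mean - mu) / sigma for Uniform(0,1) data,
# where mu = 1/2 and sigma^2 = 1/12; by the CLT, Z_n -> N(0,1) in distribution.
samples = rng.uniform(size=(reps, n))
z_n = np.sqrt(n) * (samples.mean(axis=1) - 0.5) / np.sqrt(1 / 12)

# g(x) = x^2 is continuous, so g(Z_n) -> chi^2 with 1 degree of freedom,
# i.e. the law of Z^2 for a standard normal Z.
z = rng.standard_normal(reps)  # reference draws from the limit law N(0,1)
for q in (0.25, 0.50, 0.75, 0.95):
    print(q, np.quantile(z_n**2, q), np.quantile(z**2, q))  # quantiles nearly agree
```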
Proof
Spaces S and S′ are equipped with certain metrics. For simplicity we will denote both of these metrics using the |x − y| notation, even though the metrics may be arbitrary and not necessarily Euclidean.
Convergence in distribution
We will need a particular statement from the portmanteau theorem: that convergence in distribution [math]\displaystyle{ X_n\xrightarrow{d}X }[/math] is equivalent to
- [math]\displaystyle{ \mathbb E f(X_n) \to \mathbb E f(X) }[/math] for every bounded continuous functional f.
So it suffices to prove that [math]\displaystyle{ \mathbb E f(g(X_n)) \to \mathbb E f(g(X)) }[/math] for every bounded continuous functional f. Note that [math]\displaystyle{ F = f \circ g }[/math] is bounded, and it is continuous at every point where g is. In particular, if g is continuous everywhere, F is itself a bounded continuous functional and the claim follows directly from the statement above. When g merely satisfies Pr[X ∈ Dg] = 0, the same conclusion can be reached, for example, by combining the Skorokhod representation theorem with the almost-sure case proved below.
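This step can be illustrated numerically (a hedged sketch; the choices Xn = X + N(0, 1/n) noise, g = |·|, and the bounded continuous test function f = tanh are assumptions made only for the demonstration):

```python
import numpy as np

rng = np.random.default_rng(2)
reps = 200_000
x = rng.standard_normal(reps)   # draws from the law of X ~ N(0,1)

g = np.abs    # the mapped function g (continuous here, for simplicity)
f = np.tanh   # a bounded continuous test functional f

target = f(g(x)).mean()         # Monte Carlo estimate of E f(g(X))
for n in (1, 10, 100, 10_000):
    x_n = x + rng.standard_normal(reps) / np.sqrt(n)  # X_n = X + noise -> X
    print(n, abs(f(g(x_n)).mean() - target))          # E f(g(X_n)) -> E f(g(X))
```

The gap between the two Monte Carlo averages shrinks (up to sampling noise) as n grows, mirroring the portmanteau criterion.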
Convergence in probability
Fix an arbitrary ε > 0. Then for any δ > 0 consider the set Bδ defined as
- [math]\displaystyle{ B_\delta = \big\{x\in S\setminus D_g:\ \exists y\in S:\ |x-y|\lt \delta,\, |g(x)-g(y)|\gt \varepsilon\big\}. }[/math]
This is the set of continuity points x of the function g(·) for which it is possible to find, within the δ-neighborhood of x, a point that maps outside the ε-neighborhood of g(x). By the definition of continuity, this set shrinks as δ goes to zero, so that limδ→0 Bδ = ∅.
Now suppose that |g(X) − g(Xn)| > ε. This implies that at least one of the following is true: either |X − Xn| ≥ δ, or X ∈ Dg, or X ∈ Bδ. Indeed, if X ∉ Dg and |X − Xn| < δ, then Xn lies within the δ-neighborhood of X yet is mapped more than ε away from g(X), which is precisely the defining property of Bδ. In terms of probabilities this can be written as
- [math]\displaystyle{ \Pr\big(\big|g(X_n)-g(X)\big|\gt \varepsilon\big) \leq \Pr\big(|X_n-X|\geq\delta\big) + \Pr(X\in B_\delta) + \Pr(X\in D_g). }[/math]
On the right-hand side, the first term converges to zero as n → ∞ for any fixed δ, by the definition of convergence in probability of the sequence {Xn}. The second term converges to zero as δ → 0, since the set Bδ shrinks to an empty set. And the last term is identically equal to zero by assumption of the theorem. Therefore, the conclusion is that
- [math]\displaystyle{ \lim_{n\to\infty}\Pr \big(\big|g(X_n)-g(X)\big|\gt \varepsilon\big) = 0, }[/math]
which means that g(Xn) converges to g(X) in probability.
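This limit can be checked by simulation (a minimal sketch; ε = 0.05, g(t) = t², and the sequence Xn = X + N(0, 1/n) are arbitrary illustrative assumptions): the estimated probability Pr(|g(Xn) − g(X)| > ε) shrinks toward zero as n grows.

```python
import numpy as np

rng = np.random.default_rng(3)
reps, eps = 100_000, 0.05
x = rng.standard_normal(reps)    # draws from the law of X
g = lambda t: t**2               # a continuous map

for n in (10, 100, 1_000, 10_000):
    x_n = x + rng.standard_normal(reps) / np.sqrt(n)   # X_n -> X in probability
    print(n, np.mean(np.abs(g(x_n) - g(x)) > eps))     # estimated Pr(...) -> 0
```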
Almost sure convergence
By definition of the continuity of the function g(·),
- [math]\displaystyle{ \lim_{n\to\infty}X_n(\omega) = X(\omega) \quad\Rightarrow\quad \lim_{n\to\infty}g(X_n(\omega)) = g(X(\omega)) }[/math]
at each point X(ω) where g(·) is continuous. Therefore,
- [math]\displaystyle{ \begin{align} \Pr\left(\lim_{n\to\infty}g(X_n) = g(X)\right) &\geq \Pr\left(\lim_{n\to\infty}g(X_n) = g(X),\ X\notin D_g\right) \\ &\geq \Pr\left(\lim_{n\to\infty}X_n = X,\ X\notin D_g\right) = 1, \end{align} }[/math]
because the intersection of two almost sure events is almost sure.
By definition, we conclude that g(Xn) converges to g(X) almost surely.
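Path-wise, the almost-sure statement can be seen in simulation (a sketch under assumed choices: a single path of Uniform(0,1) draws, so that the running means X̄n(ω) → μ = 1/2 by the strong law of large numbers, and an arbitrary continuous g = sin):

```python
import numpy as np

rng = np.random.default_rng(4)
mu, n_max = 0.5, 1_000_000
draws = rng.uniform(size=n_max)  # one sample path omega of Uniform(0,1) data
running_mean = np.cumsum(draws) / np.arange(1, n_max + 1)  # X_n(omega) -> mu a.s.

g = np.sin   # any continuous function
for n in (10, 1_000, 100_000, 1_000_000):
    print(n, abs(g(running_mean[n - 1]) - g(mu)))  # g(X_n(omega)) -> g(mu)
```

Unlike the convergence-in-probability check, this follows a single realization ω along the whole sequence, which is the sense in which the almost-sure statement is path-wise.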
References
- ↑ Mann, H. B.; Wald, A. (1943). "On Stochastic Limit and Order Relationships". Annals of Mathematical Statistics 14 (3): 217–226. doi:10.1214/aoms/1177731415.
- ↑ Amemiya, Takeshi (1985). Advanced Econometrics. Cambridge, MA: Harvard University Press. p. 88. ISBN 0-674-00560-0. https://books.google.com/books?id=0bzGQE14CwEC&pg=PA88.
- ↑ Sargan, Denis (1988). Lectures on Advanced Econometric Theory. Oxford: Basil Blackwell. pp. 4–8. ISBN 0-631-14956-2.
- ↑ Billingsley, Patrick (1969). Convergence of Probability Measures. John Wiley & Sons. p. 31 (Corollary 1). ISBN 0-471-07242-7.
- ↑ van der Vaart, A. W. (1998). Asymptotic Statistics. New York: Cambridge University Press. p. 7 (Theorem 2.3). ISBN 0-521-49603-9. https://books.google.com/books?id=UEuQEM5RjWgC&pg=PA7.
Original source: https://en.wikipedia.org/wiki/Continuous_mapping_theorem.