Kingman's subadditive ergodic theorem

In mathematics, Kingman's subadditive ergodic theorem is one of several ergodic theorems. It can be seen as a generalization of Birkhoff's ergodic theorem.[1] Intuitively, the subadditive ergodic theorem is a kind of random-variable version of Fekete's subadditive lemma; the passage from a fixed subadditive sequence to one generated along the orbits of a measure-preserving transformation is what makes it an ergodic theorem.[2] As a result, it can be rephrased in the language of probability, e.g. using a sequence of random variables and expected values. The theorem is named after John Kingman.

Statement of theorem

Let [math]\displaystyle{ T }[/math] be a measure-preserving transformation on the probability space [math]\displaystyle{ (\Omega,\Sigma,\mu) }[/math], and let [math]\displaystyle{ \{g_n\}_{n\in\mathbb{N}} }[/math] be a sequence of [math]\displaystyle{ L^1 }[/math] functions satisfying the subadditivity relation [math]\displaystyle{ g_{n+m}(x)\le g_n(x)+g_m(T^nx) }[/math] for all [math]\displaystyle{ m,n\in\mathbb{N} }[/math] and [math]\displaystyle{ \mu }[/math]-a.e. [math]\displaystyle{ x }[/math]. Then

[math]\displaystyle{ \lim_{n\to\infty}\frac{g_n(x)}{n}=:g(x)\in[-\infty,\infty) }[/math]

for [math]\displaystyle{ \mu }[/math]-a.e. x, where g(x) is T-invariant.

In particular, if T is ergodic, then g(x) is a constant.
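For a concrete feel of the statement, the following is a minimal numerical sketch; the transformation, the observable and the subadditive family in it are assumptions chosen purely for illustration. It takes [math]\displaystyle{ T }[/math] to be an ergodic irrational rotation of the circle and [math]\displaystyle{ g_n }[/math] the positive part of a Birkhoff sum, which is subadditive but not additive, so [math]\displaystyle{ g_n(x)/n }[/math] should converge a.e. to the constant [math]\displaystyle{ \max\left(\int f\,d\mu,\,0\right) }[/math].

```python
import numpy as np

# A minimal numerical sketch of the statement; every concrete choice below
# (the rotation, the observable f, and the subadditive family g_n) is an
# assumption made for illustration only.
#   T(x) = x + alpha mod 1, an ergodic irrational rotation,
#   f(x) = cos(2*pi*x) - 0.3,   S_n f = f + f o T + ... + f o T^(n-1),
#   g_n(x) = max(S_n f(x), 0).
# Since S_{n+m} f = S_n f + (S_m f) o T^n and max(a+b, 0) <= max(a, 0) + max(b, 0),
# the g_n are subadditive, and the theorem predicts g_n(x)/n -> max(-0.3, 0) = 0 a.e.
alpha = (np.sqrt(5) - 1) / 2

def f(x):
    return np.cos(2 * np.pi * x) - 0.3

def g_over_n(x, n):
    orbit = (x + alpha * np.arange(n)) % 1.0   # x, Tx, ..., T^(n-1) x
    return max(f(orbit).sum(), 0.0) / n        # g_n(x) / n

for n in (10, 100, 1000, 10_000, 100_000):
    print(n, g_over_n(0.123, n))               # approaches 0
```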

Equivalent statement

Given a family of real random variables [math]\displaystyle{ X(m, n) }[/math], with [math]\displaystyle{ 0 \leq m \lt n \in \N }[/math], that is stationary and subadditive in the sense that[math]\displaystyle{ \begin{aligned} & X(m+1, n+1)=X(m, n) \circ T \\ & X(0, n) \leq X(0, m)+X(m, n), \end{aligned} }[/math]there exists a random variable [math]\displaystyle{ Y }[/math] such that [math]\displaystyle{ Y \in [-\infty, +\infty) }[/math], [math]\displaystyle{ Y }[/math] is invariant with respect to [math]\displaystyle{ T }[/math], and [math]\displaystyle{ \lim_n \frac 1n X(0, n) = Y }[/math] a.s.

They are equivalent by setting

  • [math]\displaystyle{ g_n = X(0, n) }[/math] with [math]\displaystyle{ n \geq 1 }[/math];
  • [math]\displaystyle{ X(m, m+n) = g_n \circ T^m }[/math] with [math]\displaystyle{ m \geq 0 }[/math].

Proof

The following proof is due to J. Michael Steele (1989).[3]

Subadditivity by partition

Fix some [math]\displaystyle{ n\geq 1 }[/math]. By subadditivity, for any [math]\displaystyle{ l\in 1:n-1 }[/math], [math]\displaystyle{ g_n \leq g_{n-l} + g_l \circ T^{n-l} }[/math]

We can picture this as starting with the set [math]\displaystyle{ 0:n-1 }[/math] and then removing its length-[math]\displaystyle{ l }[/math] tail.

Repeating this construction until the set [math]\displaystyle{ 0:n-1 }[/math] is exhausted, we obtain a correspondence between upper bounds of [math]\displaystyle{ g_n }[/math] and partitions of [math]\displaystyle{ 0:n-1 }[/math] into consecutive intervals.

Specifically, if [math]\displaystyle{ \{k_i : (k_i + l_i - 1)\}_i }[/math] is such a partition of [math]\displaystyle{ 0:n-1 }[/math], then [math]\displaystyle{ g_n \leq \sum_i g_{l_i}\circ T^{k_i} }[/math]

Constructing g

Let [math]\displaystyle{ g := \liminf_n g_n/n }[/math]; we claim that [math]\displaystyle{ g }[/math] is [math]\displaystyle{ T }[/math]-invariant.

By subadditivity, [math]\displaystyle{ \frac{g_{n+1}}{n+1} \leq\frac{g_1 + g_n \circ T}{n+1} }[/math]

Taking the [math]\displaystyle{ n\to \infty }[/math] limit, we have [math]\displaystyle{ g \leq g\circ T }[/math]. We can visualize [math]\displaystyle{ T }[/math] as hill-climbing on the graph of [math]\displaystyle{ g }[/math]: if [math]\displaystyle{ T }[/math] actually caused a nontrivial amount of hill-climbing, it would contract the space and so could not preserve the measure. Therefore [math]\displaystyle{ g = g\circ T }[/math] a.e., as the following argument makes precise.

Let [math]\displaystyle{ c\in \R }[/math]. Since [math]\displaystyle{ g \leq g \circ T }[/math], we have [math]\displaystyle{ \{g \geq c\} \subset \{g\circ T \geq c\} = T^{-1}(\{g \geq c\}) }[/math], and since [math]\displaystyle{ T }[/math] preserves [math]\displaystyle{ \mu }[/math], both sides have the same measure, so they are equal up to a null set.

That is, [math]\displaystyle{ g(x) \geq c \iff g(Tx) \geq c }[/math], a.e..

Now apply this for all rational [math]\displaystyle{ c }[/math] to conclude that [math]\displaystyle{ g = g \circ T }[/math] a.e.

Reducing to the case of gₙ ≤ 0

By subadditivity, using the partition of [math]\displaystyle{ 0:n-1 }[/math] into singletons. [math]\displaystyle{ \begin{aligned} g_1 &\leq g_1 \\ g_2 &\leq g_1 + g_1 \circ T \\ g_3 &\leq g_1 + g_1 \circ T + g_1 \circ T^2 \\ & \cdots \end{aligned} }[/math] Now, construct the sequence [math]\displaystyle{ \begin{aligned} f_1 &= g_1 - g_1 \\ f_2 &= g_2 - (g_1 + g_1 \circ T) \\ f_3 &= g_3 - (g_1 + g_1 \circ T + g_1 \circ T^2) \\ & \cdots \end{aligned} }[/math] which satisfies [math]\displaystyle{ f_n \leq 0 }[/math] for all [math]\displaystyle{ n }[/math].

The [math]\displaystyle{ f_n }[/math] are still subadditive, since the Birkhoff sums of [math]\displaystyle{ g_1 }[/math] are additive along the orbit. By the special case of nonpositive sequences (established in the remaining sections), [math]\displaystyle{ f_n/n }[/math] converges a.e. to a [math]\displaystyle{ T }[/math]-invariant function.

By Birkhoff's pointwise ergodic theorem, the running average [math]\displaystyle{ \frac 1n (g_1 + g_1 \circ T + \cdots + g_1 \circ T^{n-1}) }[/math] converges a.e. to a [math]\displaystyle{ T }[/math]-invariant function. Therefore [math]\displaystyle{ g_n/n }[/math], being the sum of the two, converges a.e. to a [math]\displaystyle{ T }[/math]-invariant function as well.

Bounding the truncation

From now on, assume [math]\displaystyle{ g_n \leq 0 }[/math] for all [math]\displaystyle{ n }[/math] (the special case to which the theorem was reduced above). Fix arbitrary [math]\displaystyle{ \epsilon, M \gt 0 }[/math], and construct the truncated function, still [math]\displaystyle{ T }[/math]-invariant: [math]\displaystyle{ g' := \max(g, -M) }[/math] With these, it suffices to prove the a.e. upper bound[math]\displaystyle{ \limsup_n g_n/n \leq g' + \epsilon }[/math]since it would allow us to take the limit [math]\displaystyle{ \epsilon = 1/1, 1/2, 1/3, \dots }[/math], then the limit [math]\displaystyle{ M = 1, 2, 3, \dots }[/math], giving us a.e.

[math]\displaystyle{ \limsup_n g_n/n \leq \liminf_n g_n/n =: g }[/math]and then, by squeezing, [math]\displaystyle{ g_n/n }[/math] converges a.e. to [math]\displaystyle{ g }[/math]. Define two families of sets, one shrinking to the empty set and one growing to the full set. For each "length" [math]\displaystyle{ L = 1, 2, 3, \dots }[/math], define[math]\displaystyle{ B_L := \{x : g_l/l \gt g' + \epsilon, \forall l \in 1, 2, \dots, L\} }[/math] [math]\displaystyle{ A_L := B_L^c = \{x : g_l/l \leq g' + \epsilon, \exists l \in 1, 2, \dots, L\} }[/math]Since [math]\displaystyle{ g' \geq \liminf_n g_n/n }[/math], the [math]\displaystyle{ B_L }[/math] shrink to the empty set as [math]\displaystyle{ L \to \infty }[/math].


Fix [math]\displaystyle{ x \in X }[/math]. Fix [math]\displaystyle{ L \in \N }[/math]. Fix [math]\displaystyle{ n \gt L }[/math]. The ordering of these qualifiers is vitally important, because we will be removing the qualifiers one by one in the reverse order.

To prove the a.e. upper bound, we must use the subadditivity, which means we must construct a partition of the set [math]\displaystyle{ 0:n-1 }[/math]. We do this inductively:

Take the smallest [math]\displaystyle{ k }[/math] not already covered by one of the parts chosen so far.

If [math]\displaystyle{ T^k x \in A_L }[/math], then [math]\displaystyle{ g_l(T^k x)/l \leq g'(T^k x) + \epsilon = g'(x) + \epsilon }[/math] for some [math]\displaystyle{ l\in 1, 2, \dots, L }[/math], using the [math]\displaystyle{ T }[/math]-invariance of [math]\displaystyle{ g' }[/math]. Take one such [math]\displaystyle{ l }[/math] – the choice does not matter.

If [math]\displaystyle{ k+l-1 \leq n-1 }[/math], then we cut out [math]\displaystyle{ \{k, \dots, k+l-1\} }[/math]; call these parts “type 1”. Else, we cut out [math]\displaystyle{ \{k\} }[/math]; call these parts “type 2”.

Else (that is, if [math]\displaystyle{ T^k x \in B_L }[/math]), we cut out [math]\displaystyle{ \{k\} }[/math]; call these parts “type 3”.

Now convert this partition into an inequality: [math]\displaystyle{ g_n(x) \leq \sum_i g_{l_i}(T^{k_i}x) }[/math] where [math]\displaystyle{ k_i }[/math] are the starting points of the parts and [math]\displaystyle{ l_i }[/math] are their lengths.

Since all [math]\displaystyle{ g_n \leq 0 }[/math], we can drop the parts of types 2 and 3 from the bound: [math]\displaystyle{ g_n(x) \leq \sum_{i: \text{type 1}} g_{l_i}(T^{k_i}x) }[/math] By construction, each type 1 term satisfies [math]\displaystyle{ g_{l_i}(T^{k_i}x) \leq l_i(g'(x) + \epsilon) }[/math], thus [math]\displaystyle{ \frac 1n g_n(x) \leq g'(x) \frac 1n \sum_{i: \text{type 1}} l_i + \epsilon }[/math] It would be tempting to continue with [math]\displaystyle{ g'(x) \frac 1n \sum_{i: \text{type 1}} l_i \leq g'(x) }[/math], but unfortunately [math]\displaystyle{ g' \leq 0 }[/math], so the inequality goes the other way. We must instead lower-bound the sum [math]\displaystyle{ \sum_{i: \text{type 1}} l_i }[/math].

The number of type 3 parts is at most[math]\displaystyle{ \sum_{k\in 0:n-1} 1_{B_L}(T^k x) }[/math]If a number [math]\displaystyle{ k }[/math] is of type 2, then it must lie among the last [math]\displaystyle{ L-1 }[/math] elements of [math]\displaystyle{ 0:n-1 }[/math], so the number of type 2 parts is at most [math]\displaystyle{ L-1 }[/math]. Together, we have the lower bound:[math]\displaystyle{ \frac 1n \sum_{i: \text{type 1}} l_i \geq 1 - \frac{L-1}{n} - \frac 1n \sum_{k\in 0:n-1} 1_{B_L}(T^k x) }[/math]

Peeling off the first qualifier

Remove the [math]\displaystyle{ n\gt L }[/math] qualifier by taking the [math]\displaystyle{ n\to \infty }[/math] limit.

By Birkhoff's pointwise ergodic theorem, there exists an a.e. pointwise limit[math]\displaystyle{ \lim_n \frac 1n \sum_{k\in 0:n-1} 1_{B_L}(T^k x) = \bar 1_{B_L}(x) }[/math] satisfying
[math]\displaystyle{ \int \bar 1_{B_L} = \mu(B_L); \quad \bar 1_{B_L}(x) \in [0, 1] }[/math] At the limit, we find that for a.e. [math]\displaystyle{ x\in X }[/math] and every [math]\displaystyle{ L \in \N }[/math], [math]\displaystyle{ \limsup_n \frac{g_n(x)}{n} \leq g'(x) (1- \bar 1_{B_L}(x) ) + \epsilon }[/math]

Peeling off the second qualifier

Remove the [math]\displaystyle{ L \in \N }[/math] qualifier by taking the [math]\displaystyle{ L\to \infty }[/math] limit.

Since we have [math]\displaystyle{ \int \bar 1_{B_L} = \mu(B_L) \to 0 }[/math] and [math]\displaystyle{ \bar 1_{B_L} \geq \bar 1_{B_{L+1}} \geq \cdots }[/math] (because [math]\displaystyle{ 1_{B_{L}} \geq 1_{B_{L+1}} \geq \cdots }[/math]), we can apply the same argument used for proving Markov's inequality to obtain
[math]\displaystyle{ \limsup_n \frac{g_n(x)}{n} \leq g'(x) + \epsilon }[/math] for a.e. [math]\displaystyle{ x\in X }[/math].


In detail, the argument is as follows: since [math]\displaystyle{ \bar 1_{B_L} \geq \bar 1_{B_{L+1}} \geq \cdots \geq 0 }[/math] and [math]\displaystyle{ \int \bar 1_{B_L} \to 0 }[/math], we know that for any small [math]\displaystyle{ \delta, \delta' \gt 0 }[/math], every large enough [math]\displaystyle{ L }[/math] satisfies [math]\displaystyle{ \bar 1_{B_L}(x) \lt \delta }[/math] everywhere except on a set of measure at most [math]\displaystyle{ \delta' }[/math]. Thus,[math]\displaystyle{ \limsup_n \frac{g_n(x)}{n} \leq g'(x)(1-\delta) + \epsilon }[/math]with probability [math]\displaystyle{ \geq 1-\delta' }[/math]. Now take both [math]\displaystyle{ \delta, \delta' \to 0 }[/math].

Applications

Taking [math]\displaystyle{ g_n(x):=\sum_{j=0}^{n-1}f(T^jx) }[/math] with [math]\displaystyle{ f\in L^1 }[/math] recovers Birkhoff's pointwise ergodic theorem; in this case the subadditivity relation holds with equality, since [math]\displaystyle{ g_{n+m} = g_n + g_m\circ T^n }[/math].

Taking all [math]\displaystyle{ g_n }[/math] to be constant functions, we recover Fekete's subadditive lemma.
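The following quick numerical illustration of Fekete's lemma is an assumption chosen for this article, not part of the original text: the number of binary strings of length [math]\displaystyle{ n }[/math] with no two adjacent ones is submultiplicative, so its logarithm is subadditive and the normalized logarithm converges to its infimum, the logarithm of the golden ratio.

```python
from math import log, sqrt

# Hypothetical illustration of Fekete's lemma (not from the article): let c_n be the
# number of binary strings of length n with no two adjacent 1s.  Splitting a valid
# string of length n+m into its first n and last m characters shows c_{n+m} <= c_n * c_m,
# so a_n = log(c_n) is subadditive and a_n / n converges to its infimum, log((1+sqrt(5))/2).
def count_no_adjacent_ones(n):
    end0, end1 = 1, 1                    # valid strings of length 1 ending in 0 / in 1
    for _ in range(n - 1):
        end0, end1 = end0 + end1, end0   # a 0 may follow anything; a 1 only follows a 0
    return end0 + end1

print("limit:", log((1 + sqrt(5)) / 2))  # ~0.4812
for n in (1, 2, 5, 10, 20, 40, 80):
    print(n, log(count_no_adjacent_ones(n)) / n)
```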

Kingman's subadditive ergodic theorem can be used to prove statements about Lyapunov exponents. It also has applications to percolation theory and to the longest increasing subsequence problem.[4]
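A minimal sketch of the Lyapunov-exponent application follows; the particular matrices and the sampling scheme are assumptions for illustration only. For products of i.i.d. random matrices, [math]\displaystyle{ g_n = \log\|A_{n-1}\cdots A_0\| }[/math] is subadditive because the operator norm is submultiplicative, so Kingman's theorem gives an almost-sure linear growth rate (the top Lyapunov exponent), which the usual norm-renormalisation recursion estimates.

```python
import numpy as np

# Sketch (the choice of matrices is an assumption for illustration): estimate the top
# Lyapunov exponent of a product of i.i.d. random 2x2 matrices.  With T the shift on
# the sequence of matrices and g_n = log ||A_{n-1} ... A_0||, submultiplicativity of
# the operator norm gives g_{n+m} <= g_n + g_m o T^n, so Kingman's theorem yields an
# almost-sure limit g_n / n.
rng = np.random.default_rng(seed=0)
A = np.array([[2.0, 1.0], [1.0, 1.0]])
B = np.array([[1.0, 0.0], [1.0, 1.0]])

n = 100_000
v = np.array([1.0, 0.0])
log_growth = 0.0
for _ in range(n):
    M = A if rng.random() < 0.5 else B   # pick A or B with probability 1/2 each
    v = M @ v
    norm = np.linalg.norm(v)
    log_growth += np.log(norm)           # accumulate the log of the growth factor
    v /= norm                            # renormalise to avoid overflow

print(log_growth / n)                    # estimate of the top Lyapunov exponent
```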

Longest increasing subsequence

To study the longest increasing subsequence of a random permutation [math]\displaystyle{ \pi }[/math], we generate it in an equivalent way. A uniformly random permutation of [math]\displaystyle{ 1:n }[/math] is obtained by sampling [math]\displaystyle{ n }[/math] points uniformly in a square and reading off the relative order of their coordinates; an increasing subsequence of [math]\displaystyle{ \pi }[/math] then corresponds to a subset of the sampled points that is increasing in both coordinates. Write [math]\displaystyle{ L_n^* }[/math] for the length of the longest increasing subsequence of [math]\displaystyle{ \pi }[/math].

Now, define the Poisson point process with density 1 on [math]\displaystyle{ [0, \infty)^2 }[/math], and define the random variables [math]\displaystyle{ M^*_k }[/math] to be the length of the longest increasing sequence of points in the square [math]\displaystyle{ [0, k)^2 }[/math]. Define the transform [math]\displaystyle{ T }[/math] by shifting the plane by [math]\displaystyle{ (-1, -1) }[/math] and then discarding the points that have fallen out of [math]\displaystyle{ [0, \infty)^2 }[/math]; by the stationarity of the Poisson point process, [math]\displaystyle{ T }[/math] is measure-preserving.

The process is superadditive, that is, [math]\displaystyle{ M_{k+m}^* \geq M_{k}^* + M_{m}^* \circ T^k }[/math], so Kingman's theorem applies to the subadditive family [math]\displaystyle{ -M_{k}^* }[/math]. To see the superadditivity, notice that the right side constructs an increasing sequence of points first in the square [math]\displaystyle{ [0, k)^2 }[/math], then in the square [math]\displaystyle{ [k, k+m)^2 }[/math], and finally concatenates them. This produces an increasing sequence in [math]\displaystyle{ [0, k+m)^2 }[/math], but not necessarily the longest one.

Also, [math]\displaystyle{ T }[/math] is ergodic, so by Kingman's theorem, [math]\displaystyle{ M_k^* /k }[/math] converges to a constant almost surely. Since the number of Poisson points in [math]\displaystyle{ [0, k)^2 }[/math] concentrates around [math]\displaystyle{ n = k^2 }[/math], and conditioned on their number the points are independent and uniform in the square, [math]\displaystyle{ M_k^* }[/math] behaves like [math]\displaystyle{ L_n^* }[/math] with [math]\displaystyle{ n \approx k^2 }[/math]; hence [math]\displaystyle{ L_n^* / \sqrt n }[/math] converges to a constant almost surely.
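The limiting constant is in fact 2, by results of Logan–Shepp and Vershik–Kerov; this value does not follow from Kingman's theorem alone. The following Monte Carlo sketch (whose implementation choices, such as patience sorting and the sample sizes, are ours) estimates [math]\displaystyle{ L_n^* / \sqrt n }[/math] on uniformly random permutations.

```python
import numpy as np
from bisect import bisect_left

def lis_length(seq):
    """Patience sorting: O(n log n) length of the longest increasing subsequence."""
    piles = []
    for x in seq:
        i = bisect_left(piles, x)
        if i == len(piles):
            piles.append(x)   # start a new pile
        else:
            piles[i] = x      # place on the leftmost eligible pile
    return len(piles)

rng = np.random.default_rng(seed=0)
for n in (10**3, 10**4, 10**5, 10**6):
    perm = rng.permutation(n)                 # a uniformly random permutation
    print(n, lis_length(perm) / np.sqrt(n))   # slowly approaches 2
```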

References