Markov chain mixing time

From HandWiki

In probability theory, the mixing time of a Markov chain is the time until the chain is "close" to its steady-state distribution. More precisely, a fundamental result about Markov chains is that a finite-state, irreducible, aperiodic chain has a unique stationary distribution π and, regardless of the initial state, the time-t distribution of the chain converges to π as t tends to infinity. Mixing time refers to any of several formalizations of the question: how large must t be before the time-t distribution is approximately π? Writing S for the state space and X_t for the state of the chain at time t, one variant, the total variation distance mixing time, is defined as the smallest t such that, for every initial state, the total variation distance between the time-t distribution and π is at most a given threshold ε:

[math]\displaystyle{ t_{\mathrm{mix}}(\epsilon) = \min \{ t \geq 0 : \max_{x \in S} \Big[ \max_{A \subseteq S} \big|\Pr(X_t \in A \mid X_0 = x) - \pi(A)\big|\Big] \leq \epsilon \} }[/math].
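For a small chain, the definition can be evaluated directly by brute force. The following is a minimal sketch, not an efficient or canonical implementation: it assumes the transition matrix `P` and the stationary distribution `pi` are supplied as plain Python lists, and it uses the standard identity that the maximum over events A of |Pr(A) − π(A)| equals half the sum of the absolute differences over states. The names `tv_distance`, `mixing_time` and `lazy_cycle`, and the cutoff `t_max`, are illustrative choices rather than anything from a library.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def tv_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Total variation distance between two distributions on the same finite set."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def step(rows: Matrix, P: Matrix) -> Matrix:
    """Advance every row distribution by one step of the chain (rows times P)."""
    n = len(P)
    return [[sum(row[k] * P[k][j] for k in range(n)) for j in range(n)]
            for row in rows]


def mixing_time(P: Matrix, pi: Sequence[float], eps: float,
                t_max: int = 10_000) -> int:
    """Smallest t with max over start states x of TV(P^t(x, .), pi) <= eps."""
    n = len(P)
    # Row x of `rows` is the time-t distribution of the chain started at x.
    rows = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for t in range(t_max + 1):
        if max(tv_distance(row, pi) for row in rows) <= eps:
            return t
        rows = step(rows, P)
    raise ValueError("chain did not mix to within eps in t_max steps")


def lazy_cycle(n: int) -> Matrix:
    """Lazy simple random walk on an n-cycle: stay w.p. 1/2, else move to a neighbour."""
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        P[i][i] += 0.5
        P[i][(i + 1) % n] += 0.25
        P[i][(i - 1) % n] += 0.25
    return P


if __name__ == "__main__":
    # The stationary distribution of the lazy cycle walk is uniform.
    for n in (4, 8, 16):
        print(n, mixing_time(lazy_cycle(n), [1.0 / n] * n, eps=0.25))
```

Each step costs on the order of |S|³ operations, so this is only practical for chains with a few hundred states at most; it is meant to make the definition concrete rather than to be used on the large state spaces that mixing-time theory usually targets.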

Choosing a different [math]\displaystyle{ \epsilon }[/math], as long as [math]\displaystyle{ \epsilon \lt 1/2 }[/math], can only change the mixing time up to a constant factor (depending on [math]\displaystyle{ \epsilon }[/math]) and so one often fixes [math]\displaystyle{ \epsilon = 1/4 }[/math] and simply writes [math]\displaystyle{ t_{\mathrm{mix}} }[/math].
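One way to see where the constant factor comes from is the standard submultiplicativity argument, sketched here. The notation [math]\displaystyle{ d }[/math] and [math]\displaystyle{ \bar d }[/math] is introduced only for this sketch: [math]\displaystyle{ d(t) }[/math] is the worst-case total variation distance to π at time t (the quantity bounded by ε in the definition), and [math]\displaystyle{ \bar d(t) }[/math] is the largest total variation distance between the time-t distributions of the chain started from two different states. These satisfy [math]\displaystyle{ d(t) \leq \bar d(t) \leq 2 d(t) }[/math] and [math]\displaystyle{ \bar d(s+t) \leq \bar d(s)\, \bar d(t) }[/math], so for every positive integer [math]\displaystyle{ \ell }[/math]

[math]\displaystyle{ d\big(\ell\, t_{\mathrm{mix}}(\epsilon)\big) \leq \bar d\big(t_{\mathrm{mix}}(\epsilon)\big)^{\ell} \leq (2\epsilon)^{\ell} }[/math].

Taking [math]\displaystyle{ \epsilon = 1/4 }[/math] gives [math]\displaystyle{ d(\ell\, t_{\mathrm{mix}}) \leq 2^{-\ell} }[/math], and hence

[math]\displaystyle{ t_{\mathrm{mix}}(\epsilon) \leq \lceil \log_2 \epsilon^{-1} \rceil \, t_{\mathrm{mix}} }[/math].

The bound [math]\displaystyle{ (2\epsilon)^{\ell} }[/math] decays only when [math]\displaystyle{ \epsilon \lt 1/2 }[/math], which is where that restriction enters.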

This is the sense in which Dave Bayer and Persi Diaconis (1992) proved that the number of riffle shuffles needed to mix an ordinary 52-card deck is 7. Mathematical theory focuses on how mixing times change as a function of the size of the structure underlying the chain. For an [math]\displaystyle{ n }[/math]-card deck, the number of riffle shuffles needed grows as [math]\displaystyle{ 1.5 \log_2 n }[/math].

The most developed theory concerns randomized algorithms for #P-complete counting problems, such as counting the graph colorings of a given [math]\displaystyle{ n }[/math]-vertex graph. For a sufficiently large number of colors, such problems can be answered approximately using the Markov chain Monte Carlo method, by showing that the mixing time of a suitable chain grows only as [math]\displaystyle{ n \log(n) }[/math] (Jerrum 1995). This example and the shuffling example possess the rapid mixing property: the mixing time grows at most polynomially fast in [math]\displaystyle{ \log }[/math](number of states of the chain). Tools for proving rapid mixing include arguments based on conductance and the method of coupling.

In broader uses of the Markov chain Monte Carlo method, rigorous justification of simulation results would require a theoretical bound on the mixing time, and many interesting practical cases have resisted such theoretical analysis.
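The riffle shuffle figures quoted above can be explored numerically. The sketch below assumes the Gilbert–Shannon–Reeds model of a riffle shuffle and the closed form from Bayer and Diaconis (1992): after m shuffles of n cards, a permutation with r rising sequences has probability C(2^m + n − r, n) / 2^(mn), and the number of such permutations is the Eulerian number A(n, r − 1). The function names `eulerian_row` and `riffle_tv` are illustrative, and exact rational arithmetic is used so that no rounding enters the computation.

```python
from fractions import Fraction
from math import comb, factorial
from typing import List


def eulerian_row(n: int) -> List[int]:
    """Return [A(n, 0), ..., A(n, n-1)]; A(n, k) counts permutations with k descents."""
    row = [1]  # n = 1: the single permutation has no descents
    for size in range(2, n + 1):
        prev, row = row, []
        for k in range(size):
            # Recurrence: A(size, k) = (k+1) A(size-1, k) + (size-k) A(size-1, k-1)
            left = (k + 1) * prev[k] if k < len(prev) else 0
            right = (size - k) * prev[k - 1] if k >= 1 else 0
            row.append(left + right)
    return row


def riffle_tv(n: int, m: int) -> Fraction:
    """Exact total variation distance from uniform after m GSR riffle shuffles of n cards."""
    counts = eulerian_row(n)            # counts[r-1] = permutations with r rising sequences
    uniform = Fraction(1, factorial(n))
    total = Fraction(0)
    for r in range(1, n + 1):
        prob = Fraction(comb(2**m + n - r, n), 2**(m * n))
        total += counts[r - 1] * abs(prob - uniform)
    return total / 2


if __name__ == "__main__":
    for m in range(1, 11):
        print(m, round(float(riffle_tv(52, m)), 3))
```

Under these assumptions, the distance for a 52-card deck stays close to 1 for the first four shuffles, is about 0.614 after six and about 0.334 after seven, so it first falls below 1/2 at seven shuffles and below 1/4 at eight. Because the starting deck order does not affect the distance to uniform for a shuffle, no maximum over initial states is needed here.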

See also