Coupling from the past

From HandWiki
Revision as of 06:28, 24 October 2022 by Steve Marsio

Among Markov chain Monte Carlo (MCMC) algorithms, coupling from the past is a method for sampling from the stationary distribution of a Markov chain. Unlike most MCMC algorithms, which only approximate the stationary distribution, coupling from the past yields, in principle, an exact ("perfect") sample from it. It was invented by James Propp and David Wilson in 1996.

The basic idea

Consider a finite state irreducible aperiodic Markov chain [math]\displaystyle{ M }[/math] with state space [math]\displaystyle{ S }[/math] and (unique) stationary distribution [math]\displaystyle{ \pi }[/math] ([math]\displaystyle{ \pi }[/math] is a probability vector). Suppose that we come up with a probability distribution [math]\displaystyle{ \mu }[/math] on the set of maps [math]\displaystyle{ f:S\to S }[/math] with the property that for every fixed [math]\displaystyle{ s\in S }[/math], its image [math]\displaystyle{ f(s) }[/math] is distributed according to the transition probability of [math]\displaystyle{ M }[/math] from state [math]\displaystyle{ s }[/math]. An example of such a probability distribution is the one where [math]\displaystyle{ f(s) }[/math] is independent from [math]\displaystyle{ f(s') }[/math] whenever [math]\displaystyle{ s\ne s' }[/math], but it is often worthwhile to consider other distributions. Now let [math]\displaystyle{ f_j }[/math] for [math]\displaystyle{ j\in\mathbb Z }[/math] be independent samples from [math]\displaystyle{ \mu }[/math].
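The independent-coordinates choice of [math]\displaystyle{ \mu }[/math] can be sketched as follows. The three-state chain and its transition matrix below are illustrative assumptions, not from the text; each [math]\displaystyle{ f(s) }[/math] is drawn from row [math]\displaystyle{ s }[/math] of the transition matrix, independently across states.

```python
import random

# Illustrative 3-state chain (hypothetical numbers); TRANSITION[s][t] is P(s -> t).
STATES = [0, 1, 2]
TRANSITION = {
    0: [0.5, 0.3, 0.2],
    1: [0.2, 0.5, 0.3],
    2: [0.3, 0.2, 0.5],
}

def sample_map(rng):
    """One sample f from mu: f(s) is drawn from the transition row of s,
    independently for each state s, so each coordinate has the right law."""
    return {s: rng.choices(STATES, weights=TRANSITION[s])[0] for s in STATES}
```

A sample `f` is just a dictionary mapping each state to its random successor; composing such maps is then ordinary function composition.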

Suppose that [math]\displaystyle{ x }[/math] is chosen randomly according to [math]\displaystyle{ \pi }[/math] and is independent of the sequence [math]\displaystyle{ f_j }[/math]. (We do not worry for now about where this [math]\displaystyle{ x }[/math] comes from.) Then [math]\displaystyle{ f_{-1}(x) }[/math] is also distributed according to [math]\displaystyle{ \pi }[/math], since [math]\displaystyle{ \pi }[/math] is [math]\displaystyle{ M }[/math]-stationary and, by our assumption on [math]\displaystyle{ \mu }[/math], [math]\displaystyle{ f_{-1}(x) }[/math] is a one-step transition of [math]\displaystyle{ M }[/math] started from [math]\displaystyle{ x }[/math]. Define

[math]\displaystyle{ F_j:= f_{-1}\circ f_{-2}\circ\cdots\circ f_{-j}. }[/math]

Then it follows by induction that [math]\displaystyle{ F_j(x) }[/math] is also distributed according to [math]\displaystyle{ \pi }[/math] for every [math]\displaystyle{ j\in\mathbb{N} }[/math]. However, it may happen that for some [math]\displaystyle{ n\in\mathbb{N} }[/math] the image of the map [math]\displaystyle{ F_n }[/math] is a single element of [math]\displaystyle{ S }[/math]. In other words, [math]\displaystyle{ F_n(x)=F_n(y) }[/math] for each [math]\displaystyle{ y\in S }[/math]. Therefore, we do not need access to [math]\displaystyle{ x }[/math] in order to compute [math]\displaystyle{ F_n(x) }[/math]. The algorithm then involves finding some [math]\displaystyle{ n\in \mathbb N }[/math] such that [math]\displaystyle{ F_n(S) }[/math] is a singleton, and outputting the element of that singleton. Designing a good distribution [math]\displaystyle{ \mu }[/math] for which finding such an [math]\displaystyle{ n }[/math] and computing [math]\displaystyle{ F_n }[/math] is not too costly is not always obvious, but it has been accomplished successfully in several important instances.[1]
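The procedure above, growing [math]\displaystyle{ n }[/math] until [math]\displaystyle{ F_n(S) }[/math] is a singleton while reusing the maps already sampled, can be sketched as follows. The three-state chain is a hypothetical example, and this brute-force version applies [math]\displaystyle{ F_n }[/math] to every state, which is only feasible for small [math]\displaystyle{ S }[/math].

```python
import random

# Illustrative 3-state chain (hypothetical numbers); TRANSITION[s][t] is P(s -> t).
STATES = [0, 1, 2]
TRANSITION = {
    0: [0.5, 0.3, 0.2],
    1: [0.2, 0.5, 0.3],
    2: [0.3, 0.2, 0.5],
}

def sample_map(rng):
    """One sample from mu with independent coordinates."""
    return {s: rng.choices(STATES, weights=TRANSITION[s])[0] for s in STATES}

def cftp(rng):
    """Return one perfect sample from the stationary distribution.

    maps[j] plays the role of f_{-(j+1)}; when n is doubled, the maps
    already drawn are kept and reused, never resampled."""
    maps = []
    n = 1
    while True:
        while len(maps) < n:
            maps.append(sample_map(rng))
        # Compute the image F_n(S), applying f_{-n} first and f_{-1} last.
        image = set(STATES)
        for j in reversed(range(n)):
            image = {maps[j][s] for s in image}
        if len(image) == 1:
            return image.pop()  # F_n(S) is a singleton: output its element
        n *= 2
```

Each call with fresh maps produces an independent draw from [math]\displaystyle{ \pi }[/math]; reusing the old maps within a call is essential for exactness, since resampling them would bias the output.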

The monotone case

There is a special class of Markov chains in which there are particularly good choices for [math]\displaystyle{ \mu }[/math] and a tool for determining if [math]\displaystyle{ |F_n(S)|=1 }[/math]. (Here [math]\displaystyle{ |\cdot| }[/math] denotes cardinality.) Suppose that [math]\displaystyle{ S }[/math] is a partially ordered set with order [math]\displaystyle{ \le }[/math], which has a unique minimal element [math]\displaystyle{ s_0 }[/math] and a unique maximal element [math]\displaystyle{ s_1 }[/math]; that is, every [math]\displaystyle{ s\in S }[/math] satisfies [math]\displaystyle{ s_0\le s\le s_1 }[/math]. Also, suppose that [math]\displaystyle{ \mu }[/math] may be chosen to be supported on the set of monotone maps [math]\displaystyle{ f:S\to S }[/math]. Then it is easy to see that [math]\displaystyle{ |F_n(S)|=1 }[/math] if and only if [math]\displaystyle{ F_n(s_0)=F_n(s_1) }[/math], since [math]\displaystyle{ F_n }[/math] is monotone. Thus, checking this becomes rather easy. The algorithm can proceed by choosing [math]\displaystyle{ n:=n_0 }[/math] for some constant [math]\displaystyle{ n_0 }[/math], sampling the maps [math]\displaystyle{ f_{-1},\dots,f_{-n} }[/math], and outputting [math]\displaystyle{ F_n(s_0) }[/math] if [math]\displaystyle{ F_n(s_0)=F_n(s_1) }[/math]. If [math]\displaystyle{ F_n(s_0)\ne F_n(s_1) }[/math] the algorithm proceeds by doubling [math]\displaystyle{ n }[/math] and repeating as necessary until an output is obtained. (But the algorithm does not resample the maps [math]\displaystyle{ f_{-j} }[/math] which were already sampled; it uses the previously sampled maps when needed.)
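As an illustration of the monotone case (the chain below is a made-up example, not from the text), consider a random walk on [math]\displaystyle{ \{0,\dots,N\} }[/math] with reflecting boundaries, where every state is updated by the same coin flip. Each update map is then monotone, so it suffices to track only the trajectories started from [math]\displaystyle{ s_0=0 }[/math] and [math]\displaystyle{ s_1=N }[/math]:

```python
import random

N = 10  # states 0..N; 0 is the minimal element, N the maximal

def sample_update(rng):
    """One random monotone map: all states share the same coin flip,
    so s <= s' implies f(s) <= f(s')."""
    if rng.random() < 0.5:
        return lambda s: min(s + 1, N)
    return lambda s: max(s - 1, 0)

def monotone_cftp(rng):
    """Run only the chains started from the bottom state 0 and the top
    state N; by monotonicity they sandwich every other trajectory, so
    F_n(0) == F_n(N) certifies that |F_n(S)| == 1."""
    updates = []  # updates[j] is f_{-(j+1)}; reused, never resampled
    n = 1
    while True:
        while len(updates) < n:
            updates.append(sample_update(rng))
        lo, hi = 0, N
        for j in reversed(range(n)):  # apply f_{-n} first, f_{-1} last
            lo, hi = updates[j](lo), updates[j](hi)
        if lo == hi:
            return lo  # the two bounding chains have coalesced
        n *= 2
```

The doubling schedule from the text keeps the total work proportional to the final [math]\displaystyle{ n }[/math], since each round reuses all previously sampled updates.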


References

  • Propp, James Gary; Wilson, David Bruce (1996), "Exact sampling with coupled Markov chains and applications to statistical mechanics", Random Structures & Algorithms, 9 (1–2), pp. 223–252.
  • Propp, James; Wilson, David (1998), "Coupling from the past: a user's guide", Microsurveys in discrete probability (Princeton, NJ, 1997), DIMACS Ser. Discrete Math. Theoret. Comput. Sci., 41, Providence, R.I.: American Mathematical Society, pp. 181–192, doi:10.1090/dimacs/041/09, ISBN 9780821808276