Indian buffet process

From HandWiki

In the mathematical theory of probability, the Indian buffet process (IBP) is a stochastic process defining a probability distribution over sparse binary matrices with a finite number of rows and an infinite number of columns. This distribution is suitable to use as a prior for models with potentially infinite number of features. The form of the prior ensures that only a finite number of features will be present in any finite set of observations but more features may appear as more data points are observed.

Indian buffet process prior

Let [math]\displaystyle{ Z }[/math] be an [math]\displaystyle{ N \times K }[/math] binary matrix indicating the presence or absence of a latent feature. The IBP places the following prior on [math]\displaystyle{ Z }[/math]:

[math]\displaystyle{ p(Z) = \frac{\alpha^K}{\prod_{i=1}^N K_1^{(i)}!}\exp\{-\alpha H_N\}\prod_{k=1}^K \frac{(N-m_k)!(m_k-1)!}{N!} }[/math]

where [math]\displaystyle{ K }[/math] is the number of non-zero columns in [math]\displaystyle{ Z }[/math], [math]\displaystyle{ m_k }[/math] is the number of ones in column [math]\displaystyle{ k }[/math] of [math]\displaystyle{ Z }[/math], [math]\displaystyle{ H_N }[/math] is the Nth harmonic number, and [math]\displaystyle{ K_h }[/math] is the number of occurrences of the non-zero binary vector [math]\displaystyle{ h }[/math] among the columns in [math]\displaystyle{ Z }[/math]. The parameter [math]\displaystyle{ \alpha }[/math] controls the expected number of features present in each observation.

In the Indian buffet process, the rows of [math]\displaystyle{ Z }[/math] correspond to customers and the columns correspond to dishes in an infinitely long buffet. The first customer takes the first [math]\displaystyle{ \mathrm{Poisson}(\alpha) }[/math] dishes. The [math]\displaystyle{ i }[/math]-th customer then takes dishes that have been previously sampled with probability [math]\displaystyle{ m_k/i }[/math], where [math]\displaystyle{ m_k }[/math] is the number of people who have already sampled dish [math]\displaystyle{ k }[/math]. He also takes [math]\displaystyle{ \mathrm{Poisson}(\alpha / i) }[/math] new dishes. Therefore, [math]\displaystyle{ z_{nk} }[/math] is one if customer [math]\displaystyle{ n }[/math] tried the [math]\displaystyle{ k }[/math]-th dish and zero otherwise.

This process is infinitely exchangeable for an equivalence class of binary matrices defined by a left-ordered many-to-one function. [math]\displaystyle{ \operatorname{lof}(Z) }[/math] is obtained by ordering the columns of the binary matrix [math]\displaystyle{ Z }[/math] from left to right by the magnitude of the binary number expressed by that column, taking the first row as the most significant bit.

See also