# Indian buffet process

In the mathematical theory of probability, the Indian buffet process (IBP) is a stochastic process defining a probability distribution over sparse binary matrices with a finite number of rows and an infinite number of columns. This distribution is suitable to use as a prior for models with potentially infinite number of features. The form of the prior ensures that only a finite number of features will be present in any finite set of observations but more features may appear as more data points are observed.

## Indian buffet process prior

Let $\displaystyle{ Z }$ be an $\displaystyle{ N \times K }$ binary matrix indicating the presence or absence of a latent feature. The IBP places the following prior on $\displaystyle{ Z }$:

$\displaystyle{ p(Z) = \frac{\alpha^K}{\prod_{i=1}^N K_1^{(i)}!}\exp\{-\alpha H_N\}\prod_{k=1}^K \frac{(N-m_k)!(m_k-1)!}{N!} }$

where $\displaystyle{ K }$ is the number of non-zero columns in $\displaystyle{ Z }$, $\displaystyle{ m_k }$ is the number of ones in column $\displaystyle{ k }$ of $\displaystyle{ Z }$, $\displaystyle{ H_N }$ is the Nth harmonic number, and $\displaystyle{ K_h }$ is the number of occurrences of the non-zero binary vector $\displaystyle{ h }$ among the columns in $\displaystyle{ Z }$. The parameter $\displaystyle{ \alpha }$ controls the expected number of features present in each observation.

In the Indian buffet process, the rows of $\displaystyle{ Z }$ correspond to customers and the columns correspond to dishes in an infinitely long buffet. The first customer takes the first $\displaystyle{ \mathrm{Poisson}(\alpha) }$ dishes. The $\displaystyle{ i }$-th customer then takes dishes that have been previously sampled with probability $\displaystyle{ m_k/i }$, where $\displaystyle{ m_k }$ is the number of people who have already sampled dish $\displaystyle{ k }$. He also takes $\displaystyle{ \mathrm{Poisson}(\alpha / i) }$ new dishes. Therefore, $\displaystyle{ z_{nk} }$ is one if customer $\displaystyle{ n }$ tried the $\displaystyle{ k }$-th dish and zero otherwise.

This process is infinitely exchangeable for an equivalence class of binary matrices defined by a left-ordered many-to-one function. $\displaystyle{ \operatorname{lof}(Z) }$ is obtained by ordering the columns of the binary matrix $\displaystyle{ Z }$ from left to right by the magnitude of the binary number expressed by that column, taking the first row as the most significant bit.