Inside–outside algorithm

Short description: Parameter estimation method for probabilistic context-free grammars

For parsing algorithms in computer science, the inside–outside algorithm is a way of re-estimating production probabilities in a probabilistic context-free grammar. It was introduced by James K. Baker in 1979 as a generalization of the forward–backward algorithm for parameter estimation on hidden Markov models to stochastic context-free grammars. It is used to compute expectations, for example as part of the expectation–maximization algorithm (an unsupervised learning algorithm).

Inside and outside probabilities

The inside probability [math]\displaystyle{ \beta_j(p,q) }[/math] is the total probability of generating words [math]\displaystyle{ w_p \cdots w_q }[/math], given the root nonterminal [math]\displaystyle{ N^j }[/math] and a grammar [math]\displaystyle{ G }[/math]:^[1]

[math]\displaystyle{ \beta_j(p,q) = P(w_{pq}|N^j_{pq}, G) }[/math]

The outside probability [math]\displaystyle{ \alpha_j(p,q) }[/math] is the total probability of beginning with the start symbol [math]\displaystyle{ N^1 }[/math] and generating the nonterminal [math]\displaystyle{ N^j_{pq} }[/math] and all the words outside [math]\displaystyle{ w_p \cdots w_q }[/math], given a grammar [math]\displaystyle{ G }[/math]:^[1]

[math]\displaystyle{ \alpha_j(p,q) = P(w_{1(p-1)}, N^j_{pq}, w_{(q+1)m}|G) }[/math]

Computing inside probabilities

Base Case:

[math]\displaystyle{ \beta_j(p,p) = P(w_{p}|N^j, G) }[/math]

General case:

Suppose there is a rule [math]\displaystyle{ N_j \rightarrow N_r N_s }[/math] in the grammar, then the probability of generating [math]\displaystyle{ w_p \cdots w_q }[/math] starting with a subtree rooted at [math]\displaystyle{ N_j }[/math] is:

[math]\displaystyle{ \sum_{k=p}^{k=q-1} P(N_j \rightarrow N_r N_s)\beta_r(p,k) \beta_s(k+1,q) }[/math]

The inside probability [math]\displaystyle{ \beta_j(p,q) }[/math] is just the sum over all such possible rules:

[math]\displaystyle{ \beta_j(p,q) = \sum_{N_r,N_s} \sum_{k=p}^{k=q-1} P(N_j \rightarrow N_r N_s)\beta_r(p,k) \beta_s(k+1,q) }[/math]

Computing outside probabilities

Base Case:

[math]\displaystyle{ \alpha_j(1,n) = \begin{cases} 1 & \mbox{if } j=1 \\ 0 & \mbox{otherwise} \end{cases} }[/math]

Here the start symbol is [math]\displaystyle{ N_1 }[/math].

General case:

Suppose there is a rule [math]\displaystyle{ N_r \rightarrow N_j N_s }[/math] in the grammar that generates [math]\displaystyle{ N_j }[/math]. Then the left contribution of that rule to the outside probability [math]\displaystyle{ \alpha_j(p,q) }[/math] is:

[math]\displaystyle{ \sum_{k=q+1}^{k=n} P(N_r \rightarrow N_j N_s)\alpha_r(p,k) \beta_s(q+1,k) }[/math]

Now suppose there is a rule [math]\displaystyle{ N_r \rightarrow N_s N_j }[/math] in the grammar. Then the right contribution of that rule to the outside probability [math]\displaystyle{ \alpha_j(p,q) }[/math] is:

[math]\displaystyle{ \sum_{k=1}^{k=p-1} P(N_r \rightarrow N_s N_j)\alpha_r(k,q) \beta_s(k,p-1) }[/math]

The outside probability [math]\displaystyle{ \alpha_j(p,q) }[/math] is the sum of the left and right contributions over all such rules:

[math]\displaystyle{ \alpha_j(p,q) = \sum_{N_r,N_s} \sum_{k=q+1}^{k=n} P(N_r \rightarrow N_j N_s)\alpha_r(p,k) \beta_s(q+1,k) + \sum_{N_r,N_s} \sum_{k=1}^{k=p-1} P(N_r \rightarrow N_s N_j)\alpha_r(k,q) \beta_s(k,p-1) }[/math]

References

↑ ^{Jump up to: 1.0} ^1.1 Manning, Christopher D.; Hinrich Schütze (1999). Foundations of Statistical Natural Language Processing. Cambridge, MA, USA: MIT Press. pp. 388–402. ISBN 0-262-13360-1. https://archive.org/details/foundationsstati00mann_118.

J. Baker (1979): Trainable grammars for speech recognition. In J. J. Wolf and D. H. Klatt, editors, Speech communication papers presented at the 97th meeting of the Acoustical Society of America, pages 547–550, Cambridge, MA, June 1979. MIT.
Karim Lari, Steve J. Young (1990): The estimation of stochastic context-free grammars using the inside–outside algorithm. Computer Speech and Language, 4:35–56.
Karim Lari, Steve J. Young (1991): Applications of stochastic context-free grammars using the Inside–Outside algorithm. Computer Speech and Language, 5:237–257.
Fernando Pereira, Yves Schabes (1992): Inside–outside reestimation from partially bracketed corpora. Proceedings of the 30th annual meeting on Association for Computational Linguistics, Association for Computational Linguistics, 128–135.

External links

0.00

(0 votes)

Original source: https://en.wikipedia.org/wiki/Inside–outside algorithm. Read more

[manning-schuetze1999-1] {Jump up to: 1.0} ^1.1 Manning, Christopher D.; Hinrich Schütze (1999). Foundations of Statistical Natural Language Processing. Cambridge, MA, USA: MIT Press. pp. 388–402. ISBN 0-262-13360-1. https://archive.org/details/foundationsstati00mann_118.

[1]

Anonymous

Search

Inside–outside algorithm

Namespaces

More

Page actions

Contents

Inside and outside probabilities

Computing inside probabilities

Computing outside probabilities

References

External links

Navigation

Navigation

Help

Translate

Wiki tools

Wiki tools

Anonymous

Search

Inside–outside algorithm

Inside and outside probabilities

Computing inside probabilities

Computing outside probabilities

References

External links

Navigation

Wiki tools

Page tools

Other projects

Categories