Trigram tagger

From HandWiki
Short description: Statistical method for automatically identifying words by part of speech

In computational linguistics, a trigram tagger is a statistical method for automatically identifying words as being nouns, verbs, adjectives, adverbs, etc. based on second order Markov models that consider triples of consecutive words. It is trained on a text corpus as a method to predict the next word, taking the product of the probabilities of unigram, bigram and trigram. In speech recognition, algorithms utilizing trigram-tagger score better than those algorithms utilizing IIMM tagger but less well than Net tagger.

The description of the trigram tagger is provided by Brants (2000).

References

  • Kempe Andre (1993). "A stochastic Tagger and an Analysis of Tagging Errors". Internal paper. Institute for Computational Linguistics, Universität Stuttgart.
  • Brants, T. (2000) TnT - A Statistical Part-of-Speech Tagger, Proc 6th Applied Natural Language Processing Conference, ANLP-200

External links