ELMo

ELMo ("Embeddings from Language Models") is a word embedding method for representing a sequence of words as a corresponding sequence of vectors.[1] Character-level tokens are fed into a bidirectional LSTM, which produces word-level embeddings. Like BERT, but unlike bag-of-words approaches and earlier static vector methods such as Word2Vec and GloVe, ELMo embeddings are context-sensitive, producing different representations for words that share the same spelling but have different meanings (homonyms), such as "bank" in "river bank" and "bank balance".[2]
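
This context sensitivity can be seen by embedding the same word in two different sentences and comparing the resulting vectors. The following is a minimal sketch, assuming the older (pre-1.0) AllenNLP ElmoEmbedder interface with its default pretrained weights; the sentence pair and the choice of the top layer are illustrative.

```python
# Minimal sketch: compare ELMo vectors for "bank" in two different contexts.
# Assumes the older (pre-1.0) AllenNLP ElmoEmbedder API and its default weights.
import numpy as np
from allennlp.commands.elmo import ElmoEmbedder

embedder = ElmoEmbedder()  # downloads the default pretrained biLM on first use

sent_a = ["I", "sat", "on", "the", "river", "bank"]
sent_b = ["I", "checked", "my", "bank", "balance"]

# embed_sentence returns an array of shape (3 layers, num_tokens, 1024)
vecs_a = embedder.embed_sentence(sent_a)
vecs_b = embedder.embed_sentence(sent_b)

# Take the top biLSTM layer's vector for the token "bank" in each sentence
bank_a = vecs_a[2, sent_a.index("bank")]
bank_b = vecs_b[2, sent_b.index("bank")]

cosine = np.dot(bank_a, bank_b) / (np.linalg.norm(bank_a) * np.linalg.norm(bank_b))
print(f"cosine similarity between the two 'bank' vectors: {cosine:.3f}")
# The similarity is well below 1.0, reflecting the different senses of "bank".
```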

ELMo's key innovation is its use of a bidirectional language model. Unlike earlier unidirectional models, it reads text in both the forward and backward directions, so each word's representation is conditioned on its entire sentence context. This allows ELMo to capture nuances of meaning that unidirectional models miss.[3]
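
Concretely, each biLSTM layer holds a forward and a backward hidden state for every token, and the final ELMo vector is a task-specific weighted sum of all biLM layers (Peters et al., 2018). The sketch below illustrates that combination using placeholder activations; the array names, shapes, and random values are illustrative assumptions, not the reference implementation.

```python
# Sketch of the ELMo layer-mixing formula: ELMo_k = gamma * sum_j s_j * h_{k,j}
# Placeholder activations stand in for the pretrained biLM's layer outputs.
import numpy as np

num_layers, seq_len, dim = 3, 6, 1024                       # token layer + 2 biLSTM layers
layer_outputs = np.random.randn(num_layers, seq_len, dim)   # stand-in for biLM activations
# (in the real model, each biLSTM layer concatenates forward and backward states,
#  so every token vector already reflects both its left and right context)

w = np.zeros(num_layers)             # scalar layer weights, learned per downstream task
s = np.exp(w) / np.exp(w).sum()      # softmax-normalized weights s_j
gamma = 1.0                          # learned global scale

elmo = gamma * np.einsum("j,jkd->kd", s, layer_outputs)
print(elmo.shape)                    # (seq_len, dim): one contextual vector per token
```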

It was created by researchers at the Allen Institute for Artificial Intelligence[4] and the University of Washington, and was first released in February 2018.

References

  1. Peters ME, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, Zettlemoyer L (2018). "Deep contextualized word representations". arXiv:1802.05365 [cs.CL].
  2. "How to use ELMo Embedding in Bidirectional LSTM model architecture?" (2020-02-11). https://www.insofe.edu.in/insights/how-to-use-elmo-embedding-in-bidirectional-lstm-model-architecture/
  3. Van Otten, Neri (26 December 2023). "Embeddings from Language Models (ELMo): Contextual Embeddings A Powerful Shift In NLP". https://spotintelligence.com/2023/12/26/embeddings-from-language-models-elmo/
  4. "AllenNLP - ELMo — Allen Institute for AI". https://allennlp.org/elmo.