Foundation models

A foundation model is a large artificial intelligence model trained on a vast quantity of unlabeled data at scale (usually by self-supervised learning), resulting in a model that can be adapted to a wide range of downstream tasks.[1] Foundation models have driven a major transformation in how AI systems are built since their introduction in 2018. Early examples of foundation models were large pre-trained language models such as BERT[2] and GPT-3. Subsequently, several multimodal foundation models have been produced, including DALL-E, Flamingo,[3] and Florence.[4] The term was popularized by the Stanford Institute for Human-Centered Artificial Intelligence's (HAI) Center for Research on Foundation Models (CRFM).[1]
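To make the self-supervised pre-training step concrete, the sketch below shows a masked-language-modelling objective of the kind used to pre-train BERT-style models. It is a minimal illustration, not the training recipe of any particular model: it assumes the Hugging Face transformers library and PyTorch, and the model name, example sentence, and masked position are illustrative.

    import torch
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

    # Any raw sentence can serve as training data: hide a token and train
    # the model to reconstruct it, so no human-written labels are needed.
    text = "Foundation models are adapted to many downstream tasks."
    inputs = tokenizer(text, return_tensors="pt")

    # Mask one token by hand (real pre-training masks ~15% of tokens at
    # random); positions labelled -100 are ignored by the loss.
    masked_pos = 4
    labels = torch.full_like(inputs["input_ids"], -100)
    labels[0, masked_pos] = inputs["input_ids"][0, masked_pos]
    inputs["input_ids"][0, masked_pos] = tokenizer.mask_token_id

    loss = model(**inputs, labels=labels).loss  # cross-entropy on the masked token
    loss.backward()  # one self-supervised training step would follow

Because the "label" is simply the hidden token itself, an objective of this kind can be applied to raw text at web scale without human annotation.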

Definitions

The Stanford Institute for Human-Centered Artificial Intelligence's (HAI) Center for Research on Foundation Models (CRFM) described foundation models as part of a "paradigm for building AI systems", in which a model trained on a large amount of unlabeled data can be adapted to many applications.[5][6]
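As an illustration of this adaptation step, the following minimal sketch fine-tunes a pre-trained model for a downstream sentiment-classification task, again assuming the Hugging Face transformers library and PyTorch; the model name, example texts, labels, and hyperparameters are placeholders, not a definitive recipe.

    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    # Reuse the pre-trained weights; a small task-specific classification
    # head is added on top and initialized randomly.
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2
    )

    # A tiny labelled dataset standing in for the downstream task.
    texts = ["A wonderful film.", "A complete waste of time."]
    labels = torch.tensor([1, 0])
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
    model.train()
    for _ in range(3):  # a few gradient steps stand in for full fine-tuning
        loss = model(**batch, labels=labels).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

Only the small classification head is new; the bulk of the parameters carry over from pre-training, which is what allows a single foundation model to be adapted to many downstream tasks.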

History

An early concept of a foundation model can be found in I. J. Good's 1965 treatise "Speculations Concerning the First Ultraintelligent Machine".[7][8] The HAL 9000 computer in Stanley Kubrick's 1968 film 2001: A Space Odyssey was modelled on Good's ultraintelligent machine.[9]

Opportunities and risks

A 2021 arXiv report surveyed foundation models' capabilities with regard to "language, vision, robotics, reasoning, and human interaction"; technical principles such as "model architectures, training procedures, data, systems, security, evaluation, and theory"; their applications, for example in law, healthcare, and education; and their potential impact on society, including "inequity, misuse, economic and environmental impact, legal and ethical considerations".[10]

References

  1. 1.0 1.1 "Introducing the Center for Research on Foundation Models (CRFM)". Stanford HAI. https://hai.stanford.edu/news/introducing-center-research-foundation-models-crfm. 
  2. Rogers, Anna; Kovaleva, Olga; Rumshisky, Anna (2020). "A Primer in BERTology: What we know about how BERT works". arXiv:2002.12327 [cs.CL].
  3. "Tackling multiple tasks with a single visual language model". DeepMind. 28 April 2022. https://www.deepmind.com/blog/tackling-multiple-tasks-with-a-single-visual-language-model. Retrieved 13 June 2022.
  4. Yuan, Lu; Chen, Dongdong; Chen, Yi-Ling; Codella, Noel; Dai, Xiyang; Gao, Jianfeng; Hu, Houdong; Huang, Xuedong; Li, Boxin; Li, Chunyuan; Liu, Ce; Liu, Mengchen; Liu, Zicheng; Lu, Yumao; Shi, Yu; Wang, Lijuan; Wang, Jianfeng; Xiao, Bin; Xiao, Zhen; Yang, Jianwei; Zeng, Michael; Zhou, Luowei; Zhang, Pengchuan (2022). "Florence: A New Foundation Model for Computer Vision". arXiv:2111.11432 [cs.CV].
  5. "Stanford CRFM". https://crfm.stanford.edu/. 
  6. "What are foundation models?". IBM Research Blog. https://research.ibm.com/blog/what-are-foundation-models. 
  7. "Huge “foundation models” are turbo-charging AI progress". The Economist. 10 June 2022. ISSN 0013-0613. https://www.economist.com/interactive/briefing/2022/06/11/huge-foundation-models-are-turbo-charging-ai-progress. 
  8. Good, I.J. (1965), Speculations Concerning the First Ultraintelligent Machine, https://exhibits.stanford.edu/feigenbaum/catalog/gz727rg3869 
  9. Dan van der Vat (29 April 2009). "Jack Good (obituary)". The Guardian: p. 32. https://www.theguardian.com/science/2009/apr/29/jack-good-codebreaker-obituary. Retrieved 9 October 2013.
  10. Bommasani, Rishi; Hudson, Drew A.; Adeli, Ehsan; Altman, Russ; Arora, Simran; von Arx, Sydney; Bernstein, Michael S.; Bohg, Jeannette et al. (18 August 2021). On the Opportunities and Risks of Foundation Models (Report). arXiv. http://arxiv.org/abs/2108.07258. Retrieved 10 June 2022.