BLOOM (language model)

Short description: Open-access multilingual language model


BigScience Large Open-science Open-access Multilingual Language Model (BLOOM[1]) is a transformer-based language model. It was created by over 1,000 AI researchers to provide a freely available large language model to anyone who wants to experiment with one. With 176 billion parameters, trained from March to July 2022, it is considered an alternative to OpenAI's 175-billion-parameter GPT-3. BLOOM uses a decoder-only transformer architecture modified from Megatron-LM GPT-2.
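
The released checkpoints are published on the Hugging Face Hub and can be used with the transformers library. The sketch below is an illustration, not part of the original article: it loads the smaller bigscience/bloom-560m checkpoint, which shares the decoder-only architecture of the full model, and generates a short continuation. The checkpoint choice, prompt, and generation settings are illustrative assumptions.

# A minimal sketch, assuming the Hugging Face transformers and PyTorch packages
# are installed; "bigscience/bloom-560m" is a smaller published BLOOM variant
# chosen here so the example runs on ordinary hardware.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")

# Decoder-only, left-to-right generation from a prompt.
inputs = tokenizer("BLOOM is a multilingual language model that", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))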

The BLOOM project[2] was started by a co-founder of Hugging Face. Six main groups were involved: Hugging Face's BigScience team, the Microsoft DeepSpeed team, the NVIDIA Megatron-LM team, the IDRIS/GENCI team, the PyTorch team, and the volunteers of the BigScience Engineering workgroup.

BLOOM was trained on data from 46 natural languages and 13 programming languages. In total, 1.6 terabytes of pre-processed text were converted into 350 billion unique tokens to form BLOOM's training dataset.[3]
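
As an illustration of the multilingual vocabulary this corpus implies, the sketch below is an assumption-based example, not part of the original article: it applies the BLOOM tokenizer to short samples in a few of the covered natural and programming languages and reports the token counts. The sample strings and the bloom-560m checkpoint are illustrative choices.

# A minimal sketch, assuming the Hugging Face transformers package is installed;
# the bloom-560m checkpoint ships the same multilingual byte-level BPE tokenizer
# as the full 176-billion-parameter model.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")

samples = {
    "English": "Language models learn statistical patterns from text.",
    "French": "Les modèles de langue apprennent des régularités statistiques du texte.",
    "Python": "def add(a, b):\n    return a + b",
}
for name, text in samples.items():
    ids = tokenizer(text)["input_ids"]
    print(f"{name}: {len(ids)} tokens")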

References

  1. "BigScience Large Open-science Open-access Multilingual Language Model". https://huggingface.co/bigscience/bloom. 
  2. "The Technology Behind BLOOM Training". https://huggingface.co/blog/bloom-megatron-deepspeed. 
  3. Le Scao, Teven; Wang, Thomas; Hesslow, Daniel; Saulnier, Lucile; Bekman, Stas; Bari, M. Saiful; Biderman, Stella; Elsahar, Hady; Muennighoff, Niklas; Phang, Jason; Press, Ofir; Raffel, Colin; Sanh, Victor; Shen, Sheng; Sutawika, Lintang; Tae, Jaesung; Yong, Zheng Xin; Launay, Julien; Beltagy, Iz (2022). "What Language Model to Train if You Have One Million GPU Hours?". arXiv:2210.15424.