Data-centric AI

Short description: Approach to artificial intelligence emphasizing data quality and management

Data-centric AI is an approach within artificial intelligence that emphasizes improving the quality, consistency, and representativeness of the data used to train machine learning models, rather than focusing primarily on optimizing model architectures or algorithms.^[1] The idea has gained traction as researchers and practitioners have come to attribute many performance limitations of machine learning systems to data issues such as noisy labels, biased datasets, and gaps in coverage.^[2] Data-centric AI involves a disciplined approach to data cleaning, augmentation, labeling, and governance, with the aim of improving model performance and reliability in applications such as computer vision, natural language processing, and other domains.^[3]^[4]^[5]^[6]
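As an illustration of what data cleaning for noisy labels can look like in practice, the following minimal Python sketch flags examples whose given label disagrees with a model's confident prediction. It is a simplified heuristic loosely inspired by the confident learning idea in reference [6], not that paper's full method. The function name `flag_suspect_labels` and the toy data are hypothetical, and the predicted probabilities are assumed to come from a classifier evaluated on held-out data.

```python
from typing import List, Sequence


def flag_suspect_labels(
    labels: Sequence[int], probs: Sequence[Sequence[float]]
) -> List[int]:
    """Return indices of examples whose given label looks doubtful.

    labels: given (possibly noisy) class index for each example.
    probs:  predicted class probabilities per example (assumed out-of-sample).
    """
    n_classes = len(probs[0])

    # Per-class threshold: mean predicted probability of class k
    # over the examples that are labeled k.
    thresholds = []
    for k in range(n_classes):
        own = [p[k] for y, p in zip(labels, probs) if y == k]
        # A class with no labeled examples gets a threshold of 1.0.
        thresholds.append(sum(own) / len(own) if own else 1.0)

    suspects = []
    for i, (y, p) in enumerate(zip(labels, probs)):
        # Classes whose predicted probability reaches that class's threshold.
        confident = [k for k in range(n_classes) if p[k] >= thresholds[k]]
        # Flag only when the model is confident about some class
        # and the given label is not among those classes.
        if confident and y not in confident:
            suspects.append(i)
    return suspects


# Toy data (hypothetical): the third example is labeled 0,
# but the model assigns it probability 0.9 for class 1.
labels = [0, 0, 0, 1, 1, 1]
probs = [
    [0.90, 0.10],
    [0.80, 0.20],
    [0.10, 0.90],
    [0.20, 0.80],
    [0.10, 0.90],
    [0.15, 0.85],
]
suspects = flag_suspect_labels(labels, probs)
print(suspects)
```

Flagged indices are candidates for human review rather than automatic relabeling, since a confident model can still be wrong.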

References

[1] Ng, Andrew (2021). "MLOps: From Model-centric to Data-centric AI". https://www.deeplearning.ai/the-batch/data-centric-ai-development-part-2/.
[2] Sambasivan, Nithya; Kapania, Shubham; Highfill, Hannah; Akrong, Danaë; Paritosh, Praveen; Aroyo, Lora (2021). "'Everyone wants to do the model work, not the data work': Data Cascades in High-Stakes AI". Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. pp. 1–15. doi:10.1145/3411764.3445518. ISBN 978-1-4503-8096-6.
[3] Zaharia, Matei (2021). "The Rise of Data-centric AI". https://databricks.com/blog/2021/07/19/the-rise-of-data-centric-ai.html.
[4] Polyzotis, Neoklis; Roy, Sudip; Whang, Steven Euijong; Zinkevich, Martin (2017). "Data Management Challenges in Production Machine Learning". doi:10.1145/3035918.3054782.
[5] Halevy, Alon; Norvig, Peter; Pereira, Fernando (2009). "The Unreasonable Effectiveness of Data". IEEE Intelligent Systems 24 (2): 8–12. doi:10.1109/MIS.2009.36. Bibcode: 2009IISys..24b...8H.
[6] Northcutt, Curtis G.; Jiang, Lu; Chuang, Isaac L. (2021). "Confident Learning: Estimating Uncertainty in Dataset Labels". Journal of Artificial Intelligence Research 70: 1373–1411. doi:10.1613/jair.1.12125.

Original source: https://en.wikipedia.org/wiki/Data-centric AI.

