Data-centric AI

From HandWiki
Short description: Approach to artificial intelligence emphasizing data quality and management

Data-centric AI is an approach within artificial intelligence that emphasizes improving the quality, consistency, and representativeness of the data used to train machine learning models, rather than focusing primarily on optimizing model architectures or algorithms.[1] The idea has gained traction as researchers and practitioners have come to believe that many performance limitations of machine learning systems stem from issues such as noisy labels, biased datasets, and lack of coverage in the data.[2] Data-centric AI involves a disciplined approach to data cleaning, augmentation, labeling, and governance, with the aim of improving model performance and reliability in application areas such as computer vision and natural language processing.[3][4][5][6]
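
As an illustration of what a data cleaning step aimed at noisy labels might look like, the following is a minimal sketch of a heuristic loosely inspired by confident learning.[6] It is a simplification, not the published algorithm. The function name `flag_suspect_labels`, the toy labels, and the toy probabilities are all hypothetical, and the sketch assumes that per-class predicted probabilities are already available from some model, ideally computed out of sample.

```python
from collections import defaultdict


def flag_suspect_labels(labels, pred_probs):
    """Return indices of examples whose given label looks doubtful.

    labels: list of int class ids (the given, possibly noisy labels).
    pred_probs: list of per-class probability lists from some model
        (assumed here to be out-of-sample predictions).
    """
    n_classes = len(pred_probs[0])

    # Per-class threshold: the mean predicted probability of class k
    # over the examples that are labeled k.
    sums = defaultdict(float)
    counts = defaultdict(int)
    for y, p in zip(labels, pred_probs):
        sums[y] += p[y]
        counts[y] += 1
    thresholds = [
        sums[k] / counts[k] if counts[k] else 1.0 for k in range(n_classes)
    ]

    suspects = []
    for i, (y, p) in enumerate(zip(labels, pred_probs)):
        # Classes whose probability reaches that class's own threshold.
        confident = [k for k in range(n_classes) if p[k] >= thresholds[k]]
        # Flag only when the model is confident about some class
        # and the given label is not among those classes.
        if confident and y not in confident:
            suspects.append(i)
    return suspects


# Toy data (hypothetical): example 2 is labeled 0, but the model
# assigns it a high probability of being class 1.
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.90, 0.10],
    [0.80, 0.20],
    [0.10, 0.90],
    [0.15, 0.85],
    [0.10, 0.90],
    [0.30, 0.70],
]
suspects = flag_suspect_labels(labels, pred_probs)
print(suspects)
```

Flagged examples would normally be sent for human review rather than relabeled automatically, since a low-confidence prediction can also reflect a weak model or a genuinely ambiguous example.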

References

  1. Ng, Andrew (2021). "MLOps: From Model-centric to Data-centric AI". https://www.deeplearning.ai/the-batch/data-centric-ai-development-part-2/. 
  2. Sambasivan, Nithya; Kapania, Shubham; Highfill, Hannah; Akrong, Danaë; Paritosh, Praveen; Aroyo, Lora (2021). "'Everyone wants to do the model work, not the data work': Data Cascades in High-Stakes AI". Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. pp. 1–15. doi:10.1145/3411764.3445518. ISBN 978-1-4503-8096-6. 
  3. Zaharia, Matei (2021). "The Rise of Data-centric AI". https://databricks.com/blog/2021/07/19/the-rise-of-data-centric-ai.html. 
  4. Polyzotis, Neoklis; Roy, Sudip; Whang, Steven Euijong; Zinkevich, Martin (2017). "Data Management Challenges in Production Machine Learning". doi:10.1145/3035918.3054782. 
  5. Halevy, Alon; Norvig, Peter; Pereira, Fernando (2009). "The Unreasonable Effectiveness of Data". IEEE Intelligent Systems 24 (2): 8–12. doi:10.1109/MIS.2009.36. Bibcode: 2009IISys..24b...8H. 
  6. Northcutt, Curtis G.; Jiang, Lu; Chuang, Isaac L. (2021). "Confident Learning: Estimating Uncertainty in Dataset Labels". Journal of Artificial Intelligence Research 70: 1373–1411. doi:10.1613/jair.1.12125.