Software:Gretel AI

Gretel AI
Developer(s): Gretel Labs
Initial release: March 31, 2020
Written in: Python
Platform: Amazon Web Services, Microsoft Azure, Google Cloud Platform
License: SDK - Apache 2.0, Synthetics - Source-available software

Gretel (also known as Gretel Labs or Gretel AI) is a software startup focused on creating high-quality, privacy-preserving synthetic data. Its primary focus is generating textual, JSON, or tabular data. It does this by combining privacy-preservation tools (transformations, differential privacy) with data-generation tools (large language models and custom fine-tuning).

Gretel enforces quality by running checks during data generation, which reduces the amount of low-quality data in the final dataset.

The same kind of enforcement can address privacy concerns, either by applying privacy filters or by introducing appropriate levels of noise during data generation.
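The two ideas above, checking quality during generation and adding calibrated noise for privacy, can be illustrated with a minimal Python sketch. It is not Gretel's implementation and does not use the Gretel SDK. The toy record generator, the function names (`generate_records`, `passes_quality_check`, `noisy_mean`, `laplace_noise`), and the age bounds are all assumptions made for illustration. The noise step uses the standard Laplace mechanism from differential privacy, applied here to the mean of bounded values.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def passes_quality_check(record):
    """Hypothetical quality rule: ages must fall in a plausible range."""
    return 0 <= record["age"] <= 120


def generate_records(n, rng, is_valid, max_attempts=10_000):
    """Generate candidates and keep only those that pass the check."""
    kept = []
    attempts = 0
    while len(kept) < n and attempts < max_attempts:
        attempts += 1
        # Toy stand-in for a generative model (assumption, not a real model).
        candidate = {"age": round(rng.gauss(40, 25))}
        if is_valid(candidate):
            kept.append(candidate)
    return kept


def noisy_mean(values, lower, upper, epsilon, rng):
    """Laplace-mechanism estimate of the mean of values clipped to [lower, upper]."""
    clipped = [min(max(v, lower), upper) for v in values]
    # Replacing one record changes the mean by at most (upper - lower) / n.
    sensitivity = (upper - lower) / len(clipped)
    true_mean = sum(clipped) / len(clipped)
    return true_mean + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    records = generate_records(100, rng, passes_quality_check)
    ages = [r["age"] for r in records]
    print(len(records), noisy_mean(ages, 0, 120, epsilon=1.0, rng=rng))
```

In this sketch, invalid candidates are rejected before they reach the output, so the final dataset contains only records that passed the check. A smaller `epsilon` yields a larger noise scale and therefore stronger privacy at the cost of accuracy.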

Gretel's Open Source Datasets

Gretel has released a set of open-source datasets (licensed under Apache 2.0) on Hugging Face.[1]

These datasets demonstrate what can be created with Gretel itself and are available for training models or building other tools.

Gretel in Research

Gretel's synthetics offering and platform have been referenced in a few research and comparison articles. Examples include:

  • Comparison of Synthetic Data Generation Tools Using Internet of Things Data[2]
  • Gretel.ai: Open-Source Artificial Intelligence Tool To Generate New Synthetic Data[3]
  • Experiments in Reducing NLP Bias and Identifiability for Large LMs[4]
  • Performance Analysis of an Indoor LoRaWAN Network with Field Measurements and AI-Assisted Data Generation[5]

References

  1. "gretelai (Gretel.ai)". 30 October 2024. https://huggingface.co/gretelai. 
  2. M, Gayathri Hegde; Shenoy, P Deepa; R, Venugopal K (2022). "Performance Analysis of Real and Synthetic Data using Supervised ML Algorithms for Prediction of Chronic Kidney Disease". 2022 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT). pp. 1–6. doi:10.1109/CONECCT55679.2022.9865722. ISBN 978-1-6654-9781-7.
  3. "Gretel.ai: Open-Source Artificial Intelligence Tool To Generate New Synthetic Data". Malaysian Journal of Innovation in Engineering and Applied Social Sciences 1 (1). 2021. http://myjieas.psa.edu.my/index.php/myjieas/article/view/27. Retrieved 9 December 2024.
  4. "Experiments in Reducing NLP Bias and Identifiability for Large LMs". TheEyeCorpus.
  5. "Performance Analysis of an Indoor LoRaWAN Network with Field Measurements and AI-Assisted Data Generation". ICONSAD'23 3rd International Congress on Scientific Advances (Balikesir, Turkey): 1. 2023.