A Systematic Review of Synthetic Data Generation Techniques Using Generative AI

Cited by: 8
Authors
Goyal, Mandeep [1]
Mahmoud, Qusay H. [1]
Affiliations
[1] Ontario Tech Univ, Dept Elect Comp & Software Engn, Oshawa, ON L1G 0C5, Canada
Keywords
synthetic data; LLMs; GANs; VAEs; generative AI; neural networks; machine learning
DOI
10.3390/electronics13173509
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Synthetic data are increasingly recognized for their potential to address serious real-world challenges across various domains. They offer innovative solutions to the data scarcity, privacy concerns, and algorithmic biases commonly encountered in machine learning applications. Synthetic data preserve the underlying patterns and behaviors of the original dataset while altering the actual content. The methods proposed in the literature to generate synthetic data range from large language models (LLMs), which are pre-trained on very large datasets, to generative adversarial networks (GANs) and variational autoencoders (VAEs). This study provides a systematic review of the techniques proposed in the literature for generating synthetic data, identifies their limitations, and suggests potential areas for future research. The findings indicate that while these technologies can generate synthetic data of specific data types, they still have drawbacks, such as high computational requirements, unstable training, and limited privacy-preserving measures, which restrict their real-world usability. Addressing these issues will facilitate the broader adoption of synthetic data generation techniques across various disciplines, thereby advancing machine learning and data-driven solutions.
Pages: 38