Variational Autoencoder Based Synthetic Data Generation for Imbalanced Learning

Cited by: 0
Authors
Wan, Zhiqiang [1]
Zhang, Yazhou [1]
He, Haibo [1]
Affiliations
[1] Univ Rhode Isl, Dept Elect Comp & Biomed Engn, Kingston, RI 02881 USA
Funding
National Science Foundation (USA)
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Discovering patterns from imbalanced data plays an important role in numerous applications, such as health services, cyber security, and financial engineering. However, imbalanced data greatly compromise the performance of most learning algorithms. Recently, various synthetic sampling methods have been proposed to balance the dataset. Although these methods have achieved great success on many datasets, they are less effective for high-dimensional data, such as images. In this paper, we propose a variational autoencoder (VAE) based synthetic data generation method for imbalanced learning. A VAE can produce new samples that are similar to those in the original dataset, but not exactly the same. We evaluate our proposed method and compare it with traditional synthetic sampling methods on various datasets under five evaluation metrics. The experimental results demonstrate the effectiveness of the proposed method.
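The abstract does not spell out the training objective or the sampling procedure. As a hedged sketch only, a standard VAE (not necessarily the authors' exact formulation) maximizes the evidence lower bound and then generates new samples by decoding latent codes drawn from the prior. The symbols $q_\phi$, $p_\theta$, $z$, and $\tilde{x}$ are notation assumed here; they do not appear in this record.

```latex
% Standard VAE objective (evidence lower bound); notation is assumed, not taken from the record
\mathcal{L}(\theta, \phi; x)
  = \mathbb{E}_{q_\phi(z \mid x)}\bigl[\log p_\theta(x \mid z)\bigr]
  - D_{\mathrm{KL}}\bigl(q_\phi(z \mid x) \,\|\, p(z)\bigr),
  \qquad p(z) = \mathcal{N}(0, I)

% Generating one synthetic sample after training
z \sim \mathcal{N}(0, I), \qquad \tilde{x} \sim p_\theta(x \mid z)
```

Under this reading, a VAE fitted to the minority-class samples would be sampled repeatedly until the class counts are balanced. Whether the authors train on the minority class alone is an assumption, since the abstract does not say.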
Pages: 1500 - 1506
Page count: 7
Related papers
50 in total
  • [1] Distributional Learning of Variational AutoEncoder: Application to Synthetic Data Generation
    An, Seunghwan
    Jeon, Jong-June
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [2] Synthetic Data Generation Using Combinatorial Testing and Variational Autoencoder
    Khadka, Krishna
    Chandrasekaran, Jaganmohan
    Lei, Yu
    Kacker, Raghu N.
    Kuhn, D. Richard
    2023 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE TESTING, VERIFICATION AND VALIDATION WORKSHOPS, ICSTW, 2023, : 228 - 236
  • [3] A synthetic data generation system based on the variational-autoencoder technique and the linked data paradigm
    Dos Santos, Ricardo
    Aguilar, Jose
    PROGRESS IN ARTIFICIAL INTELLIGENCE, 2024, 13 (02) : 149 - 163
  • [4] A synthetic neighborhood generation based ensemble learning for the imbalanced data classification
    Chen, Zhi
    Lin, Tao
    Xia, Xin
    Xu, Hongyan
    Ding, Sha
    APPLIED INTELLIGENCE, 2018, 48 (08) : 2441 - 2457
  • [5] A synthetic neighborhood generation based ensemble learning for the imbalanced data classification
    Zhi Chen
    Tao Lin
    Xin Xia
    Hongyan Xu
    Sha Ding
    Applied Intelligence, 2018, 48 : 2441 - 2457
  • [6] KernelADASYN: Kernel Based Adaptive Synthetic Data Generation for Imbalanced Learning
    Tang, Bo
    He, Haibo
    2015 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2015, : 664 - 671
  • [7] ADA-INCVAE: Improved data generation using variational autoencoder for imbalanced classification
    Huang, Kai
    Wang, Xiaoguo
    APPLIED INTELLIGENCE, 2022, 52 (03) : 2838 - 2853
  • [8] ADA-INCVAE: Improved data generation using variational autoencoder for imbalanced classification
    Kai Huang
    Xiaoguo Wang
    Applied Intelligence, 2022, 52 : 2838 - 2853
  • [9] Variational AutoEncoder for synthetic insurance data
    Jamotton, Charlotte
    Hainaut, Donatien
    INTELLIGENT SYSTEMS WITH APPLICATIONS, 2024, 24
  • [10] IMBALANCED DATA CLASSIFICATION BASED ON EXTREME LEARNING MACHINE AUTOENCODER
    Shen, Chu
    Zhang, Su-Fang
    Zhai, Jun-Hai
    Luo, Ding-Sheng
    Chen, Jun-Fen
    PROCEEDINGS OF 2018 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), VOL 2, 2018, : 399 - 404