Cost-Sensitive Variational Autoencoding Classifier for Imbalanced Data Classification

Cited by: 3
Authors
Liu, Fen [1]
Qian, Quan [1, 2, 3]
Affiliations
[1] Shanghai Univ, Sch Comp Engn & Sci, Shanghai 200444, Peoples R China
[2] Shanghai Univ, Mat Genome Inst, Shanghai 200444, Peoples R China
[3] Zhejiang Lab, Hangzhou 311100, Peoples R China
Keywords
variational autoencoder; imbalanced data classification; cost-sensitive learning; MACHINE; SMOTE;
DOI
10.3390/a15050139
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Classification is among the core tasks in machine learning. Existing classification algorithms typically assume that the classes are at least roughly balanced. On imbalanced data, such classifiers tend to neglect the minority class in favor of overall accuracy. This is inadequate because minority-class samples are often the more important ones, for example positive samples in disease diagnosis. In this study, we propose a cost-sensitive variational autoencoding classifier that combines data-level and algorithm-level methods to address imbalanced data classification. Cost-sensitive factors assign a high cost to the misclassification of minority samples, which biases the classifier toward the minority class. We also design task-specific misclassification costs by embedding domain knowledge. Experimental results show that the proposed method performs well on the classification of bulk amorphous materials.
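The cost-sensitive idea in the abstract, charging more for errors on the minority class, can be illustrated with a generic class-weighted cross-entropy. This is only a minimal sketch and not the loss used in the paper, which also involves a variational autoencoder and domain-derived costs. The function name `cost_weighted_cross_entropy`, the toy probabilities, and the cost values are all hypothetical.

```python
import math


def cost_weighted_cross_entropy(probs, labels, class_costs):
    """Mean cross-entropy with each sample scaled by the cost of its true class.

    probs       : list of per-class probability lists, one per sample
    labels      : list of true class indices
    class_costs : list of misclassification costs, one per class (assumed values)
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for row, label in zip(probs, labels):
        p_true = min(max(row[label], eps), 1.0)
        total += class_costs[label] * -math.log(p_true)
    return total / len(labels)


# Hypothetical toy batch: class 1 is the minority and is predicted poorly.
probs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.4]]
labels = [0, 0, 0, 1]

uniform_loss = cost_weighted_cross_entropy(probs, labels, [1.0, 1.0])
weighted_loss = cost_weighted_cross_entropy(probs, labels, [1.0, 5.0])
```

With the higher cost on class 1, the poorly predicted minority sample contributes more to the loss, so a classifier trained on it is pushed toward the minority class.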
Pages: 22