Bayesian Optimization Cost-Sensitive XGBoost Learning Algorithm for Imbalanced Data in Semiconductor Industry

Cited by: 0
Authors
Shamsudin, Haziqah [1]
Yusof, Umi Kalsom [1]
Kashif, Fizza [1]
Isa, Iza Sazanita [1, 2]
Affiliations
[1] Univ Sains Malaysia, Sch Comp Sci, George Town, Malaysia
[2] Univ Teknol MARA, Coll Engn, Ctr Elect Engn Studies, George Town, Malaysia
Source
Keywords
XGBoost learning algorithm; Cost-sensitivity; Imbalanced data; Semiconductor classification; Ensembled model; CLASSIFICATION
DOI
10.5455/jjee.204-1671971895
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper proposes an improved ensemble learning model that combines extreme gradient boosting (XGBoost) with a Bayesian optimization cost-sensitive learning algorithm to handle highly imbalanced data from the semiconductor process. The goal is to achieve the highest possible accuracy (recall) for both the pass class and the fail class. Most existing models are biased toward the majority class and neglect the minority class. The proposed Bayesian optimization cost-sensitive XGBoost model is configured for and applied to a semiconductor dataset. Experimental results on a benchmark semiconductor industry dataset show accuracies of 91.46% and 23.08% for the pass and fail classes, respectively. This confirms that the proposed model is well suited to imbalanced cases in semiconductor applications. Moreover, the investigation shows that the proposed model not only maintains performance on the majority class but also classifies the minority class well.
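The two quantities the abstract reports, pass accuracy and fail accuracy, are per-class recalls. The sketch below shows one way to compute them, together with a simple majority-to-minority ratio of the kind commonly passed to XGBoost's `scale_pos_weight` parameter as a cost-sensitive weight. The function names, the label encoding (0 = pass, 1 = fail), and the toy labels are assumptions made for illustration; the abstract does not state the paper's actual cost scheme or the search space used by the Bayesian optimization.

```python
from typing import Sequence


def per_class_recall(y_true: Sequence[int], y_pred: Sequence[int], label: int) -> float:
    """Fraction of samples whose true class is `label` that were predicted as `label`."""
    total = sum(1 for t in y_true if t == label)
    if total == 0:
        raise ValueError(f"no samples with true label {label}")
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    return hits / total


def minority_cost_weight(y_true: Sequence[int], minority_label: int) -> float:
    """Ratio of majority-class count to minority-class count (binary labels assumed)."""
    n_minority = sum(1 for t in y_true if t == minority_label)
    if n_minority == 0:
        raise ValueError("minority class is absent")
    n_majority = len(y_true) - n_minority
    return n_majority / n_minority


# Hypothetical toy labels, not the paper's data: 0 = pass (majority), 1 = fail (minority).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

pass_recall = per_class_recall(y_true, y_pred, label=0)   # 7 of 8 pass samples recovered
fail_recall = per_class_recall(y_true, y_pred, label=1)   # 1 of 2 fail samples recovered
fail_weight = minority_cost_weight(y_true, minority_label=1)

print(f"pass recall = {pass_recall:.3f}")
print(f"fail recall = {fail_recall:.3f}")
print(f"fail-class weight = {fail_weight:.1f}")
```

In a setup like the one the abstract describes, a weight of this kind would be one of the hyperparameters a Bayesian optimizer tunes, with the two recalls serving as the evaluation targets.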
Pages: 552-565
Page count: 14