Improved LightGBM for Extremely Imbalanced Data and Application to Credit Card Fraud Detection

Cited by: 1
Authors
Zhao, Xiaosong [1]
Liu, Yong [1]
Zhao, Qiangfu [1]
Affiliations
[1] Univ Aizu, Grad Sch, Aizu Wakamatsu, Fukushima 9658580, Japan
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Class balancing cost-harmonization LightGBM; cost-sensitive; credit card fraud detection; extremely imbalanced data; interpretability; oversampling; SMOTE; CHALLENGES;
DOI
10.1109/ACCESS.2024.3487212
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Credit card fraud (CCF) is a significant threat to cardholders and financial institutions. CCF detection is challenging due to extremely imbalanced data (EID). EID poses two problems: extremely few fraud instances are available for training, and the risk of overlooking fraud is extremely high. While class balancing or oversampling techniques can address the former problem by penalizing the negative class or augmenting the positive data, they do not mitigate the latter. In contrast, the cost-sensitive learning approach targets only the high risk of false negative errors. Existing approaches are therefore insufficient to solve all aspects of the EID problem. Based on the LightGBM (Light Gradient Boosting Machine) framework, this study introduces two novel machine-learning methods: the class balancing cost-harmonization LightGBM (CB-CHL-LightGBM) and the oversampling cost-harmonization LightGBM (OS-CHL-LightGBM). The new approaches combine class balancing or oversampling techniques with LightGBM to solve the EID problem comprehensively. They enhance the efficacy of LightGBM in CCF detection scenarios. Experimental results on three CCF datasets indicate that the two proposed methods outperform LightGBM in several crucial performance metrics. For example, compared with the original LightGBM, CB-CHL-LightGBM or OS-CHL-LightGBM can increase the F2-score from 0.77 to 0.83 for the first dataset, from 0.77 to 0.86 for the second dataset, and from 0.70 to 0.82 for the third dataset. However, adding class balancing, oversampling, or the cost-harmonization loss to LightGBM individually may not yield better results.
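The abstract reports results as F2-scores. As a reminder of what that metric rewards, here is a minimal sketch of the standard F-beta formula computed from confusion counts. The function name `fbeta_score` and the counts below are hypothetical illustrations, not values or code from the paper.

```python
def fbeta_score(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """F-beta from confusion counts; beta > 1 weights recall above precision."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    # Return 0.0 when there are no positives at all, to avoid dividing by zero.
    return (1 + b2) * tp / denom if denom else 0.0


# Two hypothetical detectors that catch the same number of frauds (tp = 80).
# Model A raises more false alarms; model B misses more frauds.
model_a = dict(tp=80, fp=20, fn=10)
model_b = dict(tp=80, fp=10, fn=20)

f1_a = fbeta_score(**model_a, beta=1.0)
f1_b = fbeta_score(**model_b, beta=1.0)
f2_a = fbeta_score(**model_a)
f2_b = fbeta_score(**model_b)

print(f"F1: A={f1_a:.4f}  B={f1_b:.4f}")
print(f"F2: A={f2_a:.4f}  B={f2_b:.4f}")
```

With these counts the two models tie on F1, while F2 ranks model A higher because it misses fewer frauds. This is why F2 suits settings where a false negative costs more than a false alarm.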
Pages: 159316 - 159335
Page count: 20
Related papers
50 in total
  • [1] AutoEncoder and LightGBM for Credit Card Fraud Detection Problems
    Du, Haichao
    Lv, Li
    Guo, An
    Wang, Hongliang
    SYMMETRY-BASEL, 2023, 15 (04):
  • [2] Enhancing credit card fraud detection: highly imbalanced data case
    Breskuviene, Dalia
    Dzemyda, Gintautas
    JOURNAL OF BIG DATA, 2024, 11 (01)
  • [3] Synthesizing class labels for highly imbalanced credit card fraud detection data
    Kennedy, Robert K. L.
    Villanustre, Flavio
    Khoshgoftaar, Taghi M.
    Salekshahrezaee, Zahra
    JOURNAL OF BIG DATA, 2024, 11 (01)
  • [4] An efficient fraud detection framework with credit card imbalanced data in financial services
    El-Naby, Aya Abd
    Hemdan, Ezz El-Din
    El-Sayed, Ayman
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (03) : 4139 - 4160
  • [5] An efficient fraud detection framework with credit card imbalanced data in financial services
    Aya Abd El-Naby
    Ezz El-Din Hemdan
    Ayman El-Sayed
    Multimedia Tools and Applications, 2023, 82 : 4139 - 4160
  • [6] Synthesizing class labels for highly imbalanced credit card fraud detection data
    Robert K. L. Kennedy
    Flavio Villanustre
    Taghi M. Khoshgoftaar
    Zahra Salekshahrezaee
    Journal of Big Data, 11
  • [7] Credit Fraud Detection for Extremely Imbalanced Data Based on Ensembled Deep Learning
    Liu Y.
    Yang K.
    1600, Science Press (58): : 539 - 547
  • [8] Credit Card Fraud Detection using LightGBM with Asymmetric Error Control
    Hu, Xinyi
    Chen, Haiwen
    Zhang, Ranxin
    2019 SECOND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE FOR INDUSTRIES (AI4I 2019), 2019, : 91 - 94
  • [9] DATA MINING APPLICATION IN CREDIT CARD FRAUD DETECTION SYSTEM
    Ogwueleka, Francisca Nonyelum
    JOURNAL OF ENGINEERING SCIENCE AND TECHNOLOGY, 2011, 6 (03): : 311 - 322
  • [10] Quantum Autoencoder for Enhanced Fraud Detection in Imbalanced Credit Card Dataset
    Huot, Chansreynich
    Heng, Sovanmonynuth
    Kim, Tae-Kyung
    Han, Youngsun
    IEEE ACCESS, 2024, 12 : 169671 - 169682