Improved LightGBM for Extremely Imbalanced Data and Application to Credit Card Fraud Detection

Cited by: 1
Authors
Zhao, Xiaosong [1]
Liu, Yong [1]
Zhao, Qiangfu [1]
Affiliation
[1] Univ Aizu, Grad Sch, Aizu Wakamatsu, Fukushima 9658580, Japan
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Class balancing cost-harmonization LightGBM; cost-sensitive; credit card fraud detection; extremely imbalanced data; interpretability; oversampling; SMOTE; CHALLENGES;
DOI
10.1109/ACCESS.2024.3487212
Chinese Library Classification (CLC)
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Credit card fraud (CCF) is a significant threat to cardholders and financial institutions. CCF detection against this threat is challenging due to extremely imbalanced data (EID). EID involves extremely few instances of fraud for training and an extremely high risk of overlooking fraud. While class balancing or oversampling techniques can address the former problem by punishing negative classes or augmenting the positive data, they do not mitigate the latter. In contrast, the cost-sensitive learning approach targets only the high risk of false negative errors. Therefore, existing approaches are insufficient to solve all the issues of the EID problem. Based on the LightGBM (Light Gradient Boosting Machine) framework, this study introduces two novel machine-learning methods: the class balancing cost-harmonization LightGBM (CB-CHL-LightGBM) and the oversampling cost-harmonization LightGBM (OS-CHL-LightGBM). The new approaches combine class balancing or oversampling technology with LightGBM to solve the EID problem comprehensively. They enhance the efficacy of LightGBM in CCF detection scenarios. Experimental results on three CCF datasets indicate that the two proposed methods outperform LightGBM in several crucial performance metrics. For example, compared with the original LightGBM, CB-CHL-LightGBM or OS-CHL-LightGBM can increase the F2-score from 0.77 to 0.83 for the first dataset, from 0.77 to 0.86 for the second dataset, and from 0.70 to 0.82 for the third dataset. However, adding class balancing, oversampling, and cost-harmonization loss separately to LightGBM may not obtain better results.
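The abstract reports its headline results as F2-scores. As a minimal sketch of why that metric suits fraud detection, the snippet below computes the standard F-beta score from confusion counts; with beta = 2, recall is weighted more heavily than precision, so a missed fraud (false negative) lowers the score more than a false alarm does. The function name `fbeta_score` and all counts are hypothetical illustrations, not values or code from the paper, and the paper's cost-harmonization loss is not reproduced here.

```python
def fbeta_score(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """Standard F-beta score from confusion counts.

    beta > 1 weights recall above precision; beta = 2 gives the F2-score.
    Returns 0.0 when there are no positives and no positive predictions.
    """
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


# Hypothetical confusion counts for the fraud (positive) class.
# Case A: more missed frauds than false alarms.
f2_more_misses = fbeta_score(tp=80, fp=10, fn=20)
# Case B: same total number of errors, but more false alarms than misses.
f2_more_alarms = fbeta_score(tp=80, fp=20, fn=10)

print(f"F2 with 20 missed frauds, 10 false alarms: {f2_more_misses:.4f}")
print(f"F2 with 10 missed frauds, 20 false alarms: {f2_more_alarms:.4f}")
```

Both cases contain 30 errors in total, yet the case with more missed frauds scores lower, which is the asymmetry that motivates reporting F2 rather than F1 on extremely imbalanced data.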
Pages: 159316 - 159335
Number of pages: 20
Related papers
50 records in total
  • [31] Credit Card Fraud Detection: Addressing Imbalanced Datasets with a Multi-phase Approach
    El Hlouli F.Z.
    Riffi J.
    Mahraz M.A.
    Yahyaouy A.
    El Fazazy K.
    Tairi H.
    SN Computer Science, 5 (1)
  • [32] Evolutionary algorithms based on oversampling techniques for enhancing the imbalanced credit card fraud detection
    Korkoman, Malak Jalwi
    Abdullah, Monir
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2023, 44 (06) : 10311 - 10323
  • [33] Credit Card Fraud Detection Using Data Science Technique
    Jayakumar, D.
    Rose, R. Remya
    Kumar, P. Gangula Sudheer
    Vignesh, C. Bhuvan
    Bhupath, A. K.
    INTERNATIONAL JOURNAL OF EARLY CHILDHOOD SPECIAL EDUCATION, 2022, 14 (02) : 7861 - 7866
  • [34] Improving Credit Card Fraud Detection with Data Reduction Approaches
    Wang, Huanjing
    Hancock, John
    Khoshgoftaar, Taghi M.
    INTERNATIONAL JOURNAL OF RELIABILITY QUALITY AND SAFETY ENGINEERING, 2024, 31 (04)
  • [35] Data Mining Solutions for Fraud Detection in Credit Card Payments
    Farooq, Awais
    Selitskiy, Stas
    INTELLIGENT COMPUTING, VOL 1, 2022, 506 : 880 - 888
  • [36] A Comparison of Data Sampling Techniques for Credit Card Fraud Detection
    Muaz, Abdulla
    Jayabalan, Manoj
    Thiruchelvam, Vinesh
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2020, 11 (06) : 477 - 485
  • [37] Credit Card Fraud Detection System
    Filippov, V.
    Mukhanov, L.
    Shchukin, B.
    PROCEEDINGS OF THE 2008 7TH IEEE INTERNATIONAL CONFERENCE ON CYBERNETIC INTELLIGENT SYSTEMS, 2008, : 79 - +
  • [38] Credit Card Fraud Detection Using Improved Deep Learning Models
    Sulaiman, Sumaya S.
    Nadher, Ibraheem
    Hameed, Sarab M.
    CMC-COMPUTERS MATERIALS & CONTINUA, 2024, 78 (01) : 1049 - 1069
  • [39] Effective detection of sophisticated online banking fraud on extremely imbalanced data
    Wei, Wei
    Li, Jinjiu
    Cao, Longbing
    Ou, Yuming
    Chen, Jiahang
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2013, 16 (04) : 449 - 475
  • [40] Effective detection of sophisticated online banking fraud on extremely imbalanced data
    Wei Wei
    Jinjiu Li
    Longbing Cao
    Yuming Ou
    Jiahang Chen
    World Wide Web, 2013, 16 : 449 - 475