Modeling Insurance Fraud Detection Using Imbalanced Data Classification

Cited by: 38
Authors
Hassan, Amira Kamil Ibrahim [1, 2]
Abraham, Ajith [1, 3]
Affiliations
[1] Sudan Univ Sci & Technol, Dept Comp Sci, Khartoum, Sudan
[2] MIR Labs, Auburn, WA USA
[3] VSB Tech Univ Ostrava, IT4Innovat, Ostrava, Czech Republic
Keywords
Insurance fraud detection; Imbalanced data; Decision tree; Support vector machine and artificial neural network; AUTOMOBILE INSURANCE; CLAIMS
DOI
10.1007/978-3-319-27400-3_11
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes an insurance fraud detection method that deals with imbalanced data distributions. The approach builds insurance fraud detection models using a decision tree (DT), a support vector machine (SVM), and an artificial neural network (ANN) on data partitions obtained by under-sampling the majority class (with and without replacement) and merging each sample with the minority class. Ten-fold cross-validation is used for testing throughout the paper. The originality of the work lies in applying several partitioning under-sampling approaches and choosing the best one. Results on a publicly available automobile insurance fraud detection data set show that DT performs slightly better than the other algorithms, so the DT model was used to compare the different partitioning under-sampling approaches. Empirical results indicate that the proposed model gave better results.
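The partitioning under-sampling idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `undersample_partitions`, the number of partitions, the equal partition size, and the toy records are all assumptions, since the abstract does not specify them.

```python
import random

def undersample_partitions(majority, minority, n_parts, replace=False, seed=0):
    """Hypothetical helper: under-sample the majority class into n_parts
    samples and merge each one with the full minority class."""
    rng = random.Random(seed)
    size = len(majority) // n_parts  # assumed: equally sized partitions
    if replace:
        # With replacement: each partition is an independent draw, so a
        # majority record may appear in more than one partition.
        samples = [rng.choices(majority, k=size) for _ in range(n_parts)]
    else:
        # Without replacement: shuffle once and cut into disjoint chunks.
        shuffled = majority[:]
        rng.shuffle(shuffled)
        samples = [shuffled[i * size:(i + 1) * size] for i in range(n_parts)]
    # Merge every majority sample with all minority records.
    return [part + minority for part in samples]

# Toy data (assumed): (record_id, label) pairs, label 1 = fraud (minority).
majority = [(i, 0) for i in range(90)]
minority = [(i, 1) for i in range(90, 100)]

partitions = undersample_partitions(majority, minority, n_parts=3)
print([len(p) for p in partitions])
```

Each resulting partition would then be used to train and evaluate a classifier such as a decision tree, for example under ten-fold cross-validation as the abstract mentions, and the best-performing partitioning approach would be kept.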
Pages: 117-127
Number of pages: 11
Related papers
50 in total
  • [31] Automobile Insurance Fraud Detection using Supervised Classifiers
    Prasasti, Iffa Maula Nur
    Dhini, Arian
    Laoh, Enrico
    2020 5TH INTERNATIONAL WORKSHOP ON BIG DATA AND INFORMATION SECURITY (IWBIS 2020), 2020, : 49 - 53
  • [32] Real Time Credit Card Fraud Detection on Huge Imbalanced Data using Meta-Classifiers
    Kavitha, M.
    Suriakala, M.
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INVENTIVE COMPUTING AND INFORMATICS (ICICI 2017), 2017, : 881 - 887
  • [33] Data Sampling Strategies for Click Fraud Detection Using Imbalanced User Click Data of Online Advertising: An Empirical Review
    Sisodia, Deepti
    Sisodia, Dilip Singh
    IETE TECHNICAL REVIEW, 2022, 39 (04) : 789 - 798
  • [34] An unbalanced data classification model using hybrid sampling technique for fraud detection
    Padmaja, T. Maruthi
    Dhulipalla, Narendra
    Krishna, P. Radha
    Bapi, Raju S.
    Laha, A.
    PATTERN RECOGNITION AND MACHINE INTELLIGENCE, PROCEEDINGS, 2007, 4815 : 341 - +
  • [35] Synthesizing class labels for highly imbalanced credit card fraud detection data
    Kennedy, Robert K. L.
    Villanustre, Flavio
    Khoshgoftaar, Taghi M.
    Salekshahrezaee, Zahra
    JOURNAL OF BIG DATA, 2024, 11 (01)
  • [36] Improved LightGBM for Extremely Imbalanced Data and Application to Credit Card Fraud Detection
    Zhao, Xiaosong
    Liu, Yong
    Zhao, Qiangfu
    IEEE ACCESS, 2024, 12 : 159316 - 159335
  • [37] Imbalanced data classification using MapReduce and relief
    Jedrzejowicz, Joanna
    Kostrzewski, Robert
    Neumann, Jakub
    Zakrzewska, Magdalena
    JOURNAL OF INFORMATION AND TELECOMMUNICATION, 2018, 2 (02) : 217 - 230
  • [38] An efficient fraud detection framework with credit card imbalanced data in financial services
    El-Naby, Aya Abd
    Hemdan, Ezz El-Din
    El-Sayed, Ayman
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (03) : 4139 - 4160
  • [39] Credit Fraud Detection for Extremely Imbalanced Data Based on Ensembled Deep Learning
    Liu Y.
    Yang K.
    1600, Science Press (58): : 539 - 547
  • [40] Tree-Based Cost Sensitive Methods for Fraud Detection in Imbalanced Data
    Metzler, Guillaume
    Badiche, Xavier
    Belkasmi, Brahim
    Fromont, Elisa
    Habrard, Amaury
    Sebban, Marc
    ADVANCES IN INTELLIGENT DATA ANALYSIS XVII, IDA 2018, 2018, 11191 : 213 - 224