Anomaly credit data detection based on enhanced Isolation Forest

Cited by: 5
Authors
Zhang, Xiaodong [1]
Yao, Yuan [1]
Lv, Congdong [1]
Wang, Tao [2]
Affiliations
[1] Nanjing Audit Univ, Sch Informat Engn, Nanjing 211815, Peoples R China
[2] JUSFOUN BIG DATA, Beijing 10000, Peoples R China
Funding
National Key R&D Program of China;
Keywords
Credit evaluation; Anomaly detection; Class-imbalance; Cost-sensitive; EasyEnsemble; Isolation forest; SVM;
DOI
10.1007/s00170-022-09251-8
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
To address the real-world problem of false and erroneous credit data, and the resulting performance degradation of credit evaluation models, we propose an outlier detection algorithm that accounts for two characteristics of credit data: class imbalance and cost sensitivity. We use an anomaly detection model called EIF to optimize credit evaluation models. EIF uses the EasyEnsemble algorithm to construct balanced datasets, and trains Isolation Forest models for anomaly detection on these balanced datasets, which carry different disturbances. On the one hand, undersampling into balanced datasets addresses the class-imbalance problem; on the other hand, each sub-model learns from all of the minority-class samples, which addresses the cost-sensitive problem. Experiments were performed on the UCI German dataset, and a test set containing fake data was constructed by correlation. Compared with other anomaly detection algorithms applied to common credit evaluation models, the EIF-optimized model achieves a higher F1 score and a lower cost-sensitive error rate. In conclusion, the EIF model is effective in enhancing the performance of credit evaluation models on forged credit datasets.
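The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`balanced_subsets`, `fit_eif`, `anomaly_scores`), the number of subsets, the averaging of per-model scores, and the synthetic data standing in for the UCI German dataset are all hypothetical choices. Only `IsolationForest` and its `fit` / `score_samples` methods come from scikit-learn.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def balanced_subsets(X, y, n_subsets, rng):
    """EasyEnsemble-style undersampling (assumed form): every subset keeps
    all minority-class rows plus an equally sized random draw, without
    replacement, from the remaining (majority) rows."""
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    min_idx = np.flatnonzero(y == minority)
    maj_idx = np.flatnonzero(y != minority)
    for _ in range(n_subsets):
        drawn = rng.choice(maj_idx, size=len(min_idx), replace=False)
        yield X[np.concatenate([min_idx, drawn])]


def fit_eif(X, y, n_subsets=5, seed=0):
    """Train one Isolation Forest per balanced subset."""
    rng = np.random.default_rng(seed)
    return [
        IsolationForest(n_estimators=50, random_state=seed + i).fit(subset)
        for i, subset in enumerate(balanced_subsets(X, y, n_subsets, rng))
    ]


def anomaly_scores(models, X):
    """Average the sub-models (an assumption). score_samples is higher for
    normal points, so the sign is flipped: higher means more anomalous."""
    return -np.mean([m.score_samples(X) for m in models], axis=0)


# Synthetic stand-in data: 700 majority rows and 300 minority rows.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 1.0, size=(700, 4)),
               rng.normal(1.0, 1.0, size=(300, 4))])
y = np.array([0] * 700 + [1] * 300)

models = fit_eif(X, y)

# Test set: 50 plausible rows followed by 5 obviously forged rows.
X_test = np.vstack([rng.normal(0.3, 1.0, size=(50, 4)),
                    np.full((5, 4), 8.0)])
scores = anomaly_scores(models, X_test)
```

Rows with the highest scores would be filtered out before the data reaches the downstream credit evaluation model. The cut-off for that filter is not given in the abstract and would need to be tuned.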
Pages: 185-192
Page count: 8
Related papers
  • [21] OptIForest: Optimal Isolation Forest for Anomaly Detection
    Xiang, Haolong
    Zhang, Xuyun
    Hu, Hongsheng
    Qi, Lianyong
    Dou, Wanchun
    Dras, Mark
    Beheshti, Amin
    Xu, Xiaolong
    PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 2379 - 2387
  • [22] An anomaly detection approach based on the combination of LSTM autoencoder and isolation forest for multivariate time series data
    Phuong Hanh Tran
    Heuchenne, Cedric
    Thomassey, Sebastien
    DEVELOPMENTS OF ARTIFICIAL INTELLIGENCE TECHNOLOGIES IN COMPUTATION AND ROBOTICS, 2020, 12 : 589 - 596
  • [23] A parallel algorithm for network traffic anomaly detection based on Isolation Forest
    Tao, Xiaoling
    Peng, Yang
    Zhao, Feng
    Zhao, Peichao
    Wang, Yong
    INTERNATIONAL JOURNAL OF DISTRIBUTED SENSOR NETWORKS, 2018, 14 (11)
  • [24] ISOLATION FOREST FOR ANOMALY DETECTION IN HYPERSPECTRAL IMAGES
    Zhang, Kunzhong
    Kang, Xudong
    Li, Shutao
    2019 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2019), 2019, : 437 - 440
  • [25] Anomaly Data Detection of Rolling Element Bearings Vibration Signal Based on Parameter Optimization Isolation Forest
    Wang, Haiming
    Li, Qiang
    Liu, Yongqiang
    Yang, Shaopu
    MACHINES, 2022, 10 (06)
  • [26] A mathematical assessment of the isolation random forest method for anomaly detection in big data
    Morales, Fernando A.
    Ramirez, Jorge M.
    Ramos, Edgar A.
    MATHEMATICAL METHODS IN THE APPLIED SCIENCES, 2023, 46 (01) : 1156 - 1177
  • [27] Improved Anomaly Detection by Using the Attention-Based Isolation Forest
    Utkin, Lev
    Ageev, Andrey
    Konstantinov, Andrei
    Muliukha, Vladimir
    ALGORITHMS, 2023, 16 (01)
  • [28] A Revised Isolation Forest procedure for Anomaly Detection with High Number of Data Points
    Marcelli, Elisa
    Barbariol, Tommaso
    Savarino, Vincenzo
    Beghi, Alessandro
    Susto, Gian Antonio
    2022 23RD IEEE LATIN-AMERICAN TEST SYMPOSIUM (LATS 2022), 2022,
  • [29] Anomaly Detection for Power Consumption Data based on Isolated Forest
    Mao, Wei
    Cao, Xiu
    Zhou, Qinhua
    Yan, Tong
    Zhang, Yongkang
    2018 INTERNATIONAL CONFERENCE ON POWER SYSTEM TECHNOLOGY (POWERCON), 2018, : 4169 - 4174
  • [30] Similarity-Measured Isolation Forest: Anomaly Detection Method for Machine Monitoring Data
    Li, Changgen
    Guo, Liang
    Gao, Hongli
    Li, Yi
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2021, 70 (70)