ANALYSIS OF FEATURE SELECTION TECHNIQUES IN CREDIT RISK ASSESSMENT

Cited by: 0
Authors
Ramya, R. S. [1]
Kumaresan, S. [1]
Affiliations
[1] Govt Coll Technol, Dept CSE, Coimbatore, Tamil Nadu, India
Keywords
Data mining; Credit risk assessment; Feature selection; Information gain; Gain ratio; Chi-square correlation; Genetic algorithm
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Data mining is the automated extraction of hidden knowledge from large amounts of data. The computational complexity of data mining algorithms increases rapidly as the number of features in the dataset grows. Real-world credit datasets accumulate large quantities of information about clients and their financial and payment histories. Feature selection techniques are applied to such high-dimensional data to reduce dimensionality by removing irrelevant and redundant features, which improves the predictive accuracy of data mining algorithms. The objective of this work is to study information gain, gain ratio, and chi-square correlation-based feature selection methods for reducing feature dimensionality. The information gain measure computes an entropy-based value for each feature, and this value is used to decide whether the feature is selected or deleted. Gain ratio normalizes information gain using the split information value. Correlation-based feature selection uses heuristic search strategies to estimate how strongly the features are correlated with the class attribute and how independent they are of each other. Experiments were conducted on the German credit dataset, available at the UCI Machine Learning Repository, to reduce the feature dimensionality using these feature selection methods.
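The three scoring measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy labels, and the three toy feature columns below are all hypothetical and are not taken from the German credit dataset. The sketch assumes categorical features and uses only the Python standard library.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of categorical values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def information_gain(feature, labels):
    """Class entropy minus the class entropy conditioned on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(feature, labels):
    """Information gain normalized by split information (the feature's own entropy)."""
    split_info = entropy(feature)
    if split_info == 0:
        return 0.0  # a constant feature carries no information
    return information_gain(feature, labels) / split_info


def chi_square(feature, labels):
    """Pearson chi-square statistic of the feature-by-class contingency table."""
    n = len(labels)
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    observed = Counter(zip(feature, labels))
    stat = 0.0
    for value, f_count in feature_counts.items():
        for label, l_count in label_counts.items():
            expected = f_count * l_count / n
            stat += (observed[(value, label)] - expected) ** 2 / expected
    return stat


# Hypothetical toy data (NOT the German credit dataset).
labels = ["good"] * 4 + ["bad"] * 4
perfect = ["a", "a", "a", "a", "b", "b", "b", "b"]   # separates the classes exactly
useless = ["a", "a", "b", "b", "a", "a", "b", "b"]   # independent of the class
id_like = ["1", "2", "3", "4", "5", "6", "7", "8"]   # one distinct value per row

for name, column in [("perfect", perfect), ("useless", useless), ("id_like", id_like)]:
    print(name,
          round(information_gain(column, labels), 3),
          round(gain_ratio(column, labels), 3),
          round(chi_square(column, labels), 3))
```

On this toy data the ID-like column ties with the perfectly separating column on information gain (both 1.0 bit), but its gain ratio drops to about 0.33 because its split information is 3 bits. That is the bias toward many-valued features that the gain ratio normalization is meant to correct. A feature would then be kept or dropped by comparing its score against a threshold, which is left out here because the abstract does not state one.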
Pages: 6
Related papers
50 in total
  • [41] A Path-Based Feature Selection Algorithm for Enterprise Credit Risk Evaluation
    Du, Marui
    Ma, Yue
    Zhang, Zuoquan
    Computational Intelligence and Neuroscience, 2022, 2022
  • [42] A novel framework of credit risk feature selection for SMEs during industry 4.0
    Lu, Yang
    Yang, Lian
    Shi, Baofeng
    Li, Jiaxiang
    Abedin, Mohammad Zoynul
    ANNALS OF OPERATIONS RESEARCH, 2022
  • [43] A feature selection enabled hybrid-bagging algorithm for credit risk evaluation
    Dahiya, Shashi
    Handa, S. S.
    Singh, N. P.
    EXPERT SYSTEMS, 2017, 34 (06)
  • [44] Credit risk assessment using the factorization machine model with feature interactions
    Quan, Jing
    Sun, Xuelian
    HUMANITIES & SOCIAL SCIENCES COMMUNICATIONS, 2024, 11 (01)
  • [45] Feature Enhanced Ensemble Modeling With Voting Optimization for Credit Risk Assessment
    Yang, Dongqi
    Xiao, Binqing
    IEEE ACCESS, 2024, 12 : 115124 - 115136
  • [46] Evaluation of Feature Selection Techniques for Breast Cancer Risk Prediction
    Lopez, Nahum Cueto
    Garcia-Ordas, Maria Teresa
    Vitelli-Storelli, Facundo
    Fernandez-Navarro, Pablo
    Palazuelos, Camilo
    Alaiz-Rodriguez, Rocio
    INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH, 2021, 18 (20)
  • [47] Credit Risk Analysis Using Machine Learning Techniques
    Shiv, S. J.
    Murthy, Srinivasa
    Challuru, Krishnaprasad
    2018 FOURTEENTH INTERNATIONAL CONFERENCE ON INFORMATION PROCESSING (ICINPRO) - 2018, 2018, : 214 - 218
  • [48] Analysis of Feature Selection Techniques for Android Malware Detection
    Guyton, Fred
    Li, Wei
    Wang, Ling
    Kumar, Ajoy
    SOUTHEASTCON 2022, 2022, : 96 - 103
  • [49] Analysis of Feature Selection Techniques for Network Traffic Dataset
    Singh, Raman
    Kumar, Harish
    Singla, R. K.
    2013 INTERNATIONAL CONFERENCE ON MACHINE INTELLIGENCE AND RESEARCH ADVANCEMENT (ICMIRA 2013), 2013, : 42 - 46
  • [50] Computerized analysis of mammographic parenchymal patterns for breast cancer risk assessment: Feature selection
    Huo, ZM
    Giger, ML
    Wolverton, DE
    Zhong, WM
    Cumming, S
    Olopade, OI
    MEDICAL PHYSICS, 2000, 27 (01) : 4 - 12