ANALYSIS OF FEATURE SELECTION TECHNIQUES IN CREDIT RISK ASSESSMENT

Cited by: 0
Authors
Ramya, R. S. [1]
Kumaresan, S. [1]
Affiliation
[1] Govt Coll Technol, Dept CSE, Coimbatore, Tamil Nadu, India
Keywords
Data mining; Credit risk assessment; Feature selection; Information gain; Gain ratio; Chi-square correlation; Genetic algorithm
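DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract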
Data mining is the automated extraction of hidden knowledge from large amounts of data. The computational complexity of data mining algorithms increases rapidly as the number of features in the dataset grows. Real-world credit datasets have accumulated large quantities of information about clients and their financial and payment histories. Feature selection techniques are applied to such high-dimensional data to reduce dimensionality by removing irrelevant and redundant features, with the aim of improving the predictive accuracy of data mining algorithms. The objective of this work is to study information gain, gain ratio, and chi-square correlation-based feature selection methods for reducing feature dimensionality. The information gain measure is computed from the entropy associated with each feature, and the resulting value is used to decide whether the feature is retained or discarded. Gain ratio normalizes information gain using the split information value. Correlation-based feature selection uses heuristic search strategies to estimate how strongly the features are correlated with the class attribute and how they relate to one another. Experiments were conducted on the German credit dataset, available from the UCI Machine Learning Repository, to reduce the feature dimensionality using these feature selection methods.
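As an illustration of the measures named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes categorical features and uses the standard textbook definitions of entropy, information gain, split information, gain ratio, and the Pearson chi-square statistic. The function names and the toy data (`labels`, `housing`, `client_id`) are hypothetical and are not taken from the German credit dataset.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of categorical values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def information_gain(feature, labels):
    """Class entropy minus the class entropy remaining after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(feature, labels):
    """Information gain divided by split information (the entropy of the feature itself)."""
    split_info = entropy(feature)
    if split_info == 0:
        return 0.0  # a constant feature carries no information
    return information_gain(feature, labels) / split_info


def chi_square(feature, labels):
    """Pearson chi-square statistic of the feature-by-class contingency table."""
    total = len(labels)
    observed = Counter(zip(feature, labels))
    feature_counts = Counter(feature)
    class_counts = Counter(labels)
    stat = 0.0
    for f, nf in feature_counts.items():
        for c, nc in class_counts.items():
            expected = nf * nc / total
            stat += (observed.get((f, c), 0) - expected) ** 2 / expected
    return stat


# Hypothetical toy data: eight clients, one informative feature and one
# identifier-like feature that takes a distinct value for every client.
labels = ["good", "good", "good", "good", "bad", "bad", "bad", "bad"]
housing = ["own", "own", "own", "rent", "rent", "rent", "rent", "own"]
client_id = ["a", "b", "c", "d", "e", "f", "g", "h"]

for name, feature in [("housing", housing), ("client_id", client_id)]:
    print(name,
          round(information_gain(feature, labels), 4),
          round(gain_ratio(feature, labels), 4),
          round(chi_square(feature, labels), 4))
```

In this toy example the identifier-like feature separates the classes perfectly and so receives the maximum information gain, even though it would not generalize. Dividing by its large split information reduces that score, which is the motivation for the normalization that gain ratio applies.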
Pages: 6