Statistics-based wrapper for feature selection: An implementation on financial distress identification with support vector machine

Cited by: 49
Authors
Li, Hui [1 ,2 ]
Li, Chang-Jiang [1 ]
Wu, Xian-Jun [3 ]
Sun, Jie [1 ]
Affiliations
[1] Zhejiang Normal Univ, Sch Econ & Management, Jinhua 321004, Zhejiang, Peoples R China
[2] Ohio State Univ, Coll Engn, Columbus, OH 43210 USA
[3] Wuhan Univ Technol, Sch Mech & Elect Engn, Wuhan 430070, Peoples R China
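Funding
US National Science Foundation; National Natural Science Foundation of China; Science Foundation of the Ministry of Education of China;
Keywords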
Financial distress identification (FDI); Support vector machine (SVM); Statistics based feature selection; Wrapper; BANKRUPTCY PREDICTION; NEURAL-NETWORKS; GENETIC ALGORITHMS; MODELS; CLASSIFICATION; PERFORMANCE; PARAMETERS; RATIOS; RISK;
DOI
10.1016/j.asoc.2014.01.018
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Support vector machine (SVM) is an effective tool for financial distress identification (FDI). However, a potential issue that keeps SVM from being applied efficiently to identifying financial distress is how to select features in SVM-based FDI. Although filters are commonly employed, this type of approach does not consider the predictive capability of SVM itself when selecting features. This research constructs a statistics-based wrapper for SVM-based FDI by using statistical indices of ranking-order information from predictive performances on various parameters. The wrapper consists of four levels: the data level, the SVM-based model level, the feature ranking-order level, and the index level of feature selection. Once the data are ready, the predictive accuracies of one type of SVM model, i.e., linear SVM (LSVM), polynomial SVM (PSVM), Gaussian SVM (GSVM), or sigmoid SVM (SSVM), are first calculated on various pairs of parameters. Then, the performances of the SVM models on each candidate feature are converted into ranking-order indices. After this step, two statistical indices, the mean and the standard deviation, are calculated from the ranking-order information on each feature. Finally, the feature selection indices of SVM are produced by combining the statistical indices. Each feature whose feature selection index is smaller than half of the average index is selected to compose the optimal feature set. With a dataset collected for Chinese FDI 3 years in advance, we statistically verified the performance of this statistics-based wrapper against a non-statistics-based wrapper, two filters, and no feature selection for SVM-based FDI. Results on an unseen dataset indicate that GSVM with the statistics-based wrapper significantly outperformed the other SVM models with the other feature selection methods and two wrapper-based classical statistical models. (C) 2014 Elsevier B.V. All rights reserved.
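The ranking-order and index levels described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions the abstract does not confirm: the accuracy table is taken as precomputed (one hypothetical SVM per candidate feature and parameter pair, so the model level is skipped), rank 1 is given to the most accurate feature, ties share an average rank, and the mean and standard deviation of the ranks are combined by simple addition. The function names `rank_descending` and `select_features` are illustrative only.

```python
from statistics import mean, pstdev


def rank_descending(values):
    """Rank values so the largest gets rank 1; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1
    return ranks


def select_features(accuracy, combine=lambda m, s: m + s):
    """Select features from accuracy[f][p], the accuracy of feature f on parameter pair p.

    ASSUMPTION: `combine` (mean rank + rank std) is a placeholder; the abstract
    only says the two statistical indices are combined, not how.
    """
    n_features = len(accuracy)
    n_params = len(accuracy[0])

    # Ranking-order level: rank the features within each parameter pair.
    ranks_by_feature = [[] for _ in range(n_features)]
    for p in range(n_params):
        column = [accuracy[f][p] for f in range(n_features)]
        for f, r in enumerate(rank_descending(column)):
            ranks_by_feature[f].append(r)

    # Index level: combine mean and standard deviation of each feature's ranks.
    index = [combine(mean(r), pstdev(r)) for r in ranks_by_feature]

    # Selection rule from the abstract: keep features whose index is
    # smaller than half of the average index.
    threshold = mean(index) / 2
    selected = [f for f in range(n_features) if index[f] < threshold]
    return selected, index


# Toy accuracy table (made-up numbers): 6 features x 3 parameter pairs.
accuracy = [
    [0.90, 0.91, 0.92],
    [0.80, 0.81, 0.82],
    [0.70, 0.71, 0.72],
    [0.60, 0.61, 0.62],
    [0.50, 0.51, 0.52],
    [0.40, 0.41, 0.42],
]
selected, index = select_features(accuracy)
print(selected, index)
```

In this toy table every feature keeps the same rank across parameter pairs, so the standard deviations are zero and only the consistently top-ranked feature falls below half of the average index. A different `combine` function can be passed in to try other ways of merging the two statistics.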
Pages: 57-67
Page count: 11
Related papers
50 in total
  • [1] Robust statistics-based support vector machine and its variants: a survey
    Manisha Singla
    K. K. Shukla
    Neural Computing and Applications, 2020, 32 : 11173 - 11194
  • [2] Robust statistics-based support vector machine and its variants: a survey
    Singla, Manisha
    Shukla, K. K.
    NEURAL COMPUTING & APPLICATIONS, 2020, 32 (15): : 11173 - 11194
  • [3] Wrapper feature selection embedded Bagging for financial distress prediction
    Wang, G. (wgedison@gmail.com), 1600, ICIC Express Letters Office, Tokai University, Kumamoto Campus, 9-1-1, Toroku, Kumamoto, 862-8652, Japan (04):
  • [4] Wrapper Feature Selection Based on Lightning Attachment Procedure Optimization and Support Vector Machine for Intrusion Detection
    Sun, Shuang
    Ye, Zhiwei
    Yan, Lingyu
    Su, Jun
    Wang, Ruoxi
    PROCEEDINGS OF THE 2018 IEEE 4TH INTERNATIONAL SYMPOSIUM ON WIRELESS SYSTEMS WITHIN THE INTERNATIONAL CONFERENCES ON INTELLIGENT DATA ACQUISITION AND ADVANCED COMPUTING SYSTEMS (IDAACS-SWS), 2018, : 41 - 46
  • [5] Classification of Alzheimer's disease patients with hippocampal shape, wrapper based feature selection and support vector machine
    Young, Jonathan
    Ridgway, Gerard
    Leung, Kelvin
    Ourselin, Sebastien
    MEDICAL IMAGING 2012: IMAGE PROCESSING, 2012, 8314
  • [6] A wrapper method for feature selection using Support Vector Machines
    Maldonado, Sebastian
    Weber, Richard
    INFORMATION SCIENCES, 2009, 179 (13) : 2208 - 2217
  • [7] Support vector machine tree based on feature selection
    Xu, Qinzhen
    Pei, Wenjiang
    Yang, Luxi
    He, Zhenya
    NEURAL INFORMATION PROCESSING, PT 1, PROCEEDINGS, 2006, 4232 : 856 - 863
  • [8] Feature Selection for Support Vector Machine in the Study of Financial Early Warning System
    Li, Jingxiang
    Qin, Yichen
    Yi, Danhui
    Li, Yang
    Shen, Ye
    QUALITY AND RELIABILITY ENGINEERING INTERNATIONAL, 2014, 30 (06) : 867 - 877
  • [9] Predicting Corporate Financial Distress Based on Fuzzy Support Vector Machine
    Yang, Haijun
    Tai, Lei
    2008 4TH INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, NETWORKING AND MOBILE COMPUTING, VOLS 1-31, 2008, : 9811 - 9814
  • [10] Comparison of Embedded and Wrapper Approaches for Feature Selection in Support Vector Machines
    Yamada, Shinichi
    Neshatian, Kourosh
    PRICAI 2019: TRENDS IN ARTIFICIAL INTELLIGENCE, PT II, 2019, 11671 : 149 - 161