An integrated machine learning and DEA-predefined performance outcome prediction framework with high-dimensional imbalanced data

Cited by: 3
Authors
Shi, Yu [1, 3]
Zhao, Wei [2]
Affiliations
[1] Drake Univ, Coll Business & Publ Adm, Des Moines, IA USA
[2] Worcester Polytech Inst, Dept Biomed Engn, Worcester, MA USA
[3] Drake Univ, Coll Business & Publ Adm, Des Moines, IA 50311 USA
Keywords
Data envelopment analysis; machine learning; feature selection; performance evaluation; contextual variables; bank branch efficiency; credit risk; bankruptcy prediction; operating efficiency; financial ratios; SMOTE; outliers; model
DOI
10.1080/03155986.2023.2168943
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In performance evaluation, emerging studies use machine learning to increase the interpretability and robustness of data envelopment analysis (DEA), a non-parametric tool for assessing the relative performance of decision-making units (DMUs). In these studies, the machine learning models typically do not replicate the DEA process of directly labeling DMUs by their relative performance, and no standardized methodological framework serves this purpose. We propose a data-driven and computationally efficient system that imitates DEA and predicts performance outcomes grouped into several classes. First, a DEA composite index was constructed, and the resulting DEA scores were labeled as good, acceptable, or underperforming. Next, the synthetic minority oversampling technique (SMOTE) with the Manhattan distance metric was used to address class imbalance in the labeled, high-dimensional dataset. The framework was built with several classifiers, including random forest, support vector machine, and logistic regression, to verify that it is not model-dependent; these classifiers achieved comparable recall rates (82.70%-95.39%). Moreover, the impacts of contextual variables on DMU performance were identified using model-based feature selection and logistic regression. The framework was tested on a banking dataset and on an independent dataset covering the electronics, service, and retail industries.
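The oversampling step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a plain-Python version of the basic SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours), with neighbours ranked by Manhattan (L1) distance. The function names `manhattan` and `smote_manhattan`, the parameter defaults, and the toy feature values are all assumptions made for illustration.

```python
import random


def manhattan(a, b):
    """L1 (city-block) distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def smote_manhattan(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours, where
    "nearest" is measured with the Manhattan distance.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by L1 distance to the base sample.
        others = [m for j, m in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda m: manhattan(base, m))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: three DMUs from an under-represented class.
minority = [[0.10, 0.20], [0.15, 0.25], [0.30, 0.10]]
new_points = smote_manhattan(minority, n_new=4, k=2, seed=42)
```

Because every synthetic point is a convex combination of two existing minority samples, it stays inside the bounding box of the minority class. In practice one would more likely use an established library implementation than a hand-written loop; the sketch only shows where the distance metric enters the procedure.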
Pages: 100-129
Number of pages: 30