Measuring the prediction difficulty of individual cases in a dataset using machine learning

Authors
Kwon, Hyunjin [1, 2]
Greenberg, Matthew [3]
Josephson, Colin Bruce [4, 6]
Lee, Joon [2, 5, 6, 7]
Affiliations
[1] Univ Calgary, Schulich Sch Engn, Dept Biomed Engn, Calgary, AB, Canada
[2] Univ Calgary, Cumming Sch Med, Data Intelligence Hlth Lab, Calgary, AB, Canada
[3] Univ Calgary, Dept Math & Stat, Fac Sci, Calgary, AB, Canada
[4] Univ Calgary, Cumming Sch Med, Dept Clin Neurosci, Calgary, AB, Canada
[5] Univ Calgary, Cumming Sch Med, Dept Cardiac Sci, Calgary, AB, Canada
[6] Univ Calgary, Cumming Sch Med, Dept Community Hlth Sci, Calgary, AB, Canada
[7] Kyung Hee Univ, Sch Med, Dept Prevent Med, Seoul, South Korea
Source
SCIENTIFIC REPORTS | 2024 / Volume 14 / Issue 01
Funding
Natural Sciences and Engineering Research Council of Canada
DOI
10.1038/s41598-024-61284-z
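Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Subject classification codes
07; 0710; 09
Abstract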
Varying levels of prediction difficulty are one of the key factors that researchers encounter when applying machine learning to data. Although previous studies have introduced various metrics for assessing the prediction difficulty of individual cases, these metrics require specific dataset preconditions. In this paper, we propose three novel metrics for measuring the prediction difficulty of individual cases using fully-connected feedforward neural networks. The first metric is based on the complexity of the neural network needed to make a correct prediction. The second metric employs a pair of neural networks: one makes a prediction for a given case, and the other predicts whether the prediction made by the first model is likely to be correct. The third metric assesses the variability of the neural network's predictions. We investigated these metrics using a variety of datasets, visualized their values, and compared them to fifteen existing metrics from the literature. The results demonstrate that the proposed case difficulty metrics were better able to differentiate various levels of difficulty than most of the existing metrics and showed consistent effectiveness across diverse datasets. We expect our metrics will provide researchers with a new perspective on understanding their datasets and applying machine learning in various fields.
Pages: 15
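Illustrative sketch

The abstract describes the first and third metrics only in words, and this record does not give the paper's formulas. The Python sketch below is therefore an assumption about how such scores might look, not the authors' definitions. The function names `complexity_difficulty` and `variability_difficulty`, their inputs, and the example numbers are all hypothetical. The first function takes the smallest hidden-layer size at which a network gets a case right as that case's difficulty. The second summarizes how much the true-class probability varies across several independently trained networks.

```python
from statistics import mean, pstdev


def complexity_difficulty(correct_by_size):
    """Hypothetical stand-in for the complexity-based metric.

    correct_by_size maps a hidden-neuron count to True/False, i.e. whether
    a network of that size predicted this case correctly.
    Returns the smallest successful size, or None if no size succeeded.
    """
    successful = sorted(size for size, ok in correct_by_size.items() if ok)
    return successful[0] if successful else None


def variability_difficulty(true_class_probs):
    """Hypothetical stand-in for the variability-based metric.

    true_class_probs holds the probability that each of several
    independently trained networks assigned to the true class.
    Returns (shortfall, spread): how far the mean probability falls
    below 1, and the population standard deviation across networks.
    """
    if not true_class_probs:
        raise ValueError("need at least one prediction")
    shortfall = 1.0 - mean(true_class_probs)
    spread = pstdev(true_class_probs)
    return shortfall, spread


# Made-up probabilities for two cases, for illustration only.
easy = variability_difficulty([0.95, 0.97, 0.96])
hard = variability_difficulty([0.20, 0.80, 0.50])
print("easy case:", easy)
print("hard case:", hard)
print("smallest sufficient size:", complexity_difficulty({1: False, 4: True, 16: True}))
```

Under these assumptions a harder case shows both a larger shortfall and a larger spread, and it needs a larger network before any prediction is correct. The second metric in the abstract (a second network that predicts whether the first is correct) would need trained models and is not sketched here.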