Measuring Difficulty of Learning Using Ensemble Methods

Authors
Chen, Bowen [1]
Koh, Yun Sing [1]
Halstead, Ben [1]
Affiliations
[1] Univ Auckland, Sch Comp Sci, Auckland, New Zealand
Source
DATA MINING, AUSDM 2022 | 2022 / Vol. 1741
Keywords
Complexity measures; Boosting; Instance difficulty; CLASSIFICATION PROBLEMS; COMPLEXITY-MEASURES;
DOI
10.1007/978-981-19-8746-5_3
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Measuring the difficulty of each instance is a crucial metaknowledge extraction problem. Most studies on data complexity have focused on extracting characteristics at the dataset level rather than the instance level, and they require complete label knowledge of the dataset, which is often expensive to obtain. At the instance level, the most commonly used metrics for identifying difficult-to-classify instances depend on the learning algorithm used (i.e., uncertainty) and measure the entire system rather than the dataset alone. Additionally, these metrics only provide information about misclassification with respect to the learning algorithm, not with respect to the composition of the instances within the dataset. We propose several novel instance difficulty measures in a semi-supervised boosted ensemble setting that identify difficult-to-classify instances based on their learning difficulty relative to other instances within the dataset. The proposed measures capture both the fluctuations in labeling during the construction of the ensemble and the amount of resources required to reach the correct label. This provides a degree of difficulty and gives further insight into the origin of classification difficulty at the instance level, as reflected by the scores of the different difficulty measures.
Pages: 28-42
Page count: 15