Measuring Difficulty of Learning Using Ensemble Methods

Cited by: 0

Authors:

Chen, Bowen ^{[1]}

Koh, Yun Sing ^{[1]}

Halstead, Ben ^{[1]}

Affiliations:

[1] Univ Auckland, Sch Comp Sci, Auckland, New Zealand

Source:

DATA MINING, AUSDM 2022 | 2022 / Vol. 1741

Keywords:

Complexity measures; Boosting; Instance difficulty; CLASSIFICATION PROBLEMS; COMPLEXITY-MEASURES;

DOI:

10.1007/978-981-19-8746-5_3

Chinese Library Classification (CLC):

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Measuring the difficulty of each instance is a crucial meta-knowledge extraction problem. Most studies on data complexity have focused on extracting characteristics at the dataset level rather than the instance level, and they require complete label knowledge of the dataset, which is often expensive to obtain. At the instance level, the most commonly used metrics for identifying difficult-to-classify instances (i.e., uncertainty) depend on the learning algorithm used and measure the entire system rather than the dataset alone. Additionally, these metrics only describe misclassification with respect to the learning algorithm, not with respect to the composition of the instances within the dataset. We propose several novel instance difficulty measures in a semi-supervised boosted ensemble setting that identify difficult-to-classify instances based on their learning difficulty relative to other instances in the dataset. The proposed measures capture both the fluctuations in labeling during construction of the ensemble and the amount of resources required to reach the correct label. This provides a degree of difficulty and gives further insight into the origin of classification difficulty at the instance level, as reflected by the scores of the different difficulty measures.
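The two quantities named in the abstract (labeling fluctuations during ensemble construction, and resources required for the correct label) can be illustrated with a minimal sketch. This is not the authors' method: it assumes a fully labeled, supervised setting rather than the semi-supervised one in the paper, it assumes "resources" can be approximated by the number of ensemble stages, and the function names `label_fluctuations` and `stages_until_correct` are hypothetical. The input `staged_preds` is assumed to hold one row of predicted labels per ensemble stage.

```python
from typing import List, Sequence


def label_fluctuations(staged_preds: Sequence[Sequence[int]]) -> List[int]:
    """Count, per instance, how often the predicted label changes
    between consecutive ensemble stages (hypothetical proxy measure)."""
    n_instances = len(staged_preds[0])
    flips = [0] * n_instances
    for prev, curr in zip(staged_preds, staged_preds[1:]):
        for i in range(n_instances):
            if prev[i] != curr[i]:
                flips[i] += 1
    return flips


def stages_until_correct(
    staged_preds: Sequence[Sequence[int]], y_true: Sequence[int]
) -> List[int]:
    """Per instance, the number of stages after which the prediction is
    correct and stays correct; n_stages + 1 if the final prediction is wrong.
    Stage count is an assumed stand-in for 'resources required'."""
    n_stages = len(staged_preds)
    result = []
    for i, label in enumerate(y_true):
        needed = n_stages + 1
        # Walk backwards from the final stage while the prediction is correct.
        for t in range(n_stages - 1, -1, -1):
            if staged_preds[t][i] != label:
                break
            needed = t + 1
        result.append(needed)
    return result


# Toy example: 4 stages, 3 instances.
staged = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 1],
    [0, 1, 0],
]
y = [0, 1, 1]
flips = label_fluctuations(staged)
stages = stages_until_correct(staged, y)
```

In this toy example the first instance is stable and correct from the first stage, while the third flips twice and ends on the wrong label, so it would score as the hardest under both proxies. With a real boosted ensemble, the staged predictions might be collected from an iterator such as scikit-learn's `AdaBoostClassifier.staged_predict`.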

Pages: 28-42

Number of pages: 15
