Feature Ranking for Hierarchical Multi-Label Classification with Tree Ensemble Methods

Cited by: 7
Authors
Petkovic, Matej [1]
Dzeroski, Saso
Kocev, Dragi
Affiliations
[1] Jozef Stefan Inst, Jamova 39, Ljubljana 1000, Slovenia
Keywords
hierarchical multi-label classification; feature ranking; ensemble methods; Relief; RELIEFF
DOI
10.12700/APH.17.10.2020.10.8
Chinese Library Classification number
T [Industrial Technology]
Subject classification code
08
Abstract
In this work, we address the task of feature ranking for hierarchical multi-label classification (HMLC). The task of HMLC concerns problems with multiple binary target variables that are organized into a hierarchy. The goal is to train a model that learns and accurately predicts all of them simultaneously. This task is receiving increasing attention from the research community, due to its wide application potential in text document classification and functional genomics. Here, we propose a group of feature ranking methods based on three established ensemble methods of predictive clustering trees: Bagging, Random Forests and Extra Trees. Predictive clustering trees are a generalization of decision trees toward predicting structured outputs. Furthermore, we propose to use three scoring functions for calculating the feature importance values: Symbolic, Genie3 and Random Forest. We test the proposed methods on 30 benchmark HMLC datasets and show that the Symbolic and Genie3 scores return relevant rankings, and that all three scores outperform the HMLC-Relief ranking method and are computed in a very time-efficient manner. For each scoring function, we find the most appropriate ensemble method and compare the scores to find the best one.
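As a rough illustration of the HMLC target structure described in the abstract, the sketch below encodes a label set as a vector of binary variables over a small hierarchy. The toy hierarchy, the names `PARENT`, `close_under_ancestors` and `to_binary_vector`, and the convention that an assigned label implies all of its ancestors are assumptions made for this example; they are not taken from the paper.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy hierarchy: child label -> parent label
# (None marks a top-level label). Not taken from the paper.
PARENT: Dict[str, Optional[str]] = {
    "1": None,
    "1.1": "1",
    "1.2": "1",
    "1.2.1": "1.2",
    "2": None,
}

# Fixed label order used for the binary target vectors.
LABELS: List[str] = sorted(PARENT)


def close_under_ancestors(labels: Set[str]) -> Set[str]:
    """Return the label set extended with every ancestor of each label.

    Assumes the common HMLC convention that an example assigned to a
    label is also assigned to all of that label's ancestors.
    """
    closed: Set[str] = set()
    for label in labels:
        node: Optional[str] = label
        # Walk up to the root; stop early if this path was already added.
        while node is not None and node not in closed:
            closed.add(node)
            node = PARENT[node]
    return closed


def to_binary_vector(labels: Set[str]) -> List[int]:
    """Encode a label set as one binary variable per label in LABELS."""
    closed = close_under_ancestors(labels)
    return [1 if label in closed else 0 for label in LABELS]


if __name__ == "__main__":
    # Label order is ["1", "1.1", "1.2", "1.2.1", "2"].
    print(to_binary_vector({"1.2.1"}))  # [1, 0, 1, 1, 0]
```

A model for HMLC would predict all entries of such a vector at once, and a feature ranking would score each descriptive feature by its relevance to the whole vector rather than to a single label.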
Pages: 129-148
Page count: 20