Feature Ranking for Hierarchical Multi-Label Classification with Tree Ensemble Methods

Cited by: 7
Authors
Petkovic, Matej [1]
Dzeroski, Saso
Kocev, Dragi
Affiliations
[1] Jozef Stefan Inst, Jamova 39, Ljubljana 1000, Slovenia
Keywords
hierarchical multi-label classification; feature ranking; ensemble methods; Relief; RELIEFF;
DOI
10.12700/APH.17.10.2020.10.8
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
In this work, we address the task of feature ranking for hierarchical multi-label classification (HMLC). HMLC concerns problems with multiple binary target variables organized into a hierarchy. The goal is to train a model that accurately predicts all of them simultaneously. This task is receiving increasing attention from the research community due to its wide application potential in text document classification and functional genomics. Here, we propose a group of feature ranking methods based on three established ensemble methods of predictive clustering trees: Bagging, Random Forests and Extra Trees. Predictive clustering trees are a generalization of decision trees towards predicting structured outputs. Furthermore, we propose to use three scoring functions for calculating the feature importance values: Symbolic, Genie3 and Random Forest. We test the proposed methods on 30 benchmark HMLC datasets and show that the Symbolic and Genie3 scores return relevant rankings, and that all three scores outperform the HMLC-Relief ranking method and are computed in a very time-efficient manner. For each scoring function, we find the most appropriate ensemble method and compare the scores to find the best one.
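To make the idea of an ensemble-based importance score concrete, the following is a minimal sketch of a Genie3-style ranking. It is not the authors' implementation. It assumes that a Genie3-style score credits each feature with the size-weighted variance reduction of the splits that use it, and it replaces predictive clustering trees with small bagged trees that minimize the plain (unweighted) sum of per-label variances, ignoring any hierarchy weights. All function names (`variance_sum`, `best_split`, `grow`, `genie3_style_ranking`) and the toy data are hypothetical.

```python
import random

def variance_sum(rows):
    """Sum, over all labels, of the variance of each binary label column."""
    n = len(rows)
    if n == 0:
        return 0.0
    total = 0.0
    for j in range(len(rows[0])):
        mean = sum(r[j] for r in rows) / n
        total += sum((r[j] - mean) ** 2 for r in rows) / n
    return total

def best_split(X, Y, features):
    """Return (feature, threshold, gain) with the largest variance reduction."""
    n = len(X)
    parent = variance_sum(Y)
    best = (None, None, 0.0)
    for f in features:
        # Candidate thresholds: every distinct value except the largest.
        for t in sorted({x[f] for x in X})[:-1]:
            left = [y for x, y in zip(X, Y) if x[f] <= t]
            right = [y for x, y in zip(X, Y) if x[f] > t]
            child = (len(left) * variance_sum(left)
                     + len(right) * variance_sum(right)) / n
            gain = parent - child
            if gain > best[2]:
                best = (f, t, gain)
    return best

def grow(X, Y, scores, rng, max_depth, n_candidates):
    """Grow one tree; add (node size * gain) to the score of each split feature."""
    if max_depth == 0 or len(X) < 2:
        return
    features = rng.sample(range(len(X[0])), n_candidates)
    f, t, gain = best_split(X, Y, features)
    if f is None:  # no split reduces the variance, so this node is a leaf
        return
    scores[f] += len(X) * gain
    for keep in (lambda v: v <= t, lambda v: v > t):
        idx = [i for i, x in enumerate(X) if keep(x[f])]
        grow([X[i] for i in idx], [Y[i] for i in idx],
             scores, rng, max_depth - 1, n_candidates)

def genie3_style_ranking(X, Y, n_trees=25, max_depth=3, n_candidates=None, seed=0):
    """Average the per-feature scores over trees grown on bootstrap samples."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    n_candidates = n_candidates or d  # d candidates per node: plain bagging
    scores = [0.0] * d
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        grow([X[i] for i in idx], [Y[i] for i in idx],
             scores, rng, max_depth, n_candidates)
    return [s / n_trees for s in scores]

# Toy data: feature 0 drives both labels, feature 1 is noise.
# The labels respect a two-level hierarchy: child = 1 implies parent = 1.
data_rng = random.Random(1)
X, Y = [], []
for _ in range(60):
    x0, x1 = data_rng.random(), data_rng.random()
    X.append([x0, x1])
    Y.append([int(x0 > 0.3), int(x0 > 0.7)])  # [parent, child]

scores = genie3_style_ranking(X, Y)
print(scores)
```

Setting `n_candidates` below the number of features turns the sketch into a Random-Forest-style ensemble; the informative feature should still receive the higher score on this toy data.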
Pages: 129 - 148
Page count: 20
Related papers
50 in total
  • [21] Advanced Multi-Label Image Classification Techniques Using Ensemble Methods
    Katona, Tamas
    Toth, Gabor
    Petro, Matyas
    Harangi, Balazs
    MACHINE LEARNING AND KNOWLEDGE EXTRACTION, 2024, 6 (02) : 1281 - 1297
  • [22] An Ensemble Embedded Feature Selection Method for Multi-Label Clinical Text Classification
    Guo, Yumeng
    Chung, Fulai
    Li, Guozheng
    2016 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2016, : 823 - 826
  • [23] The importance of the label hierarchy in hierarchical multi-label classification
    Levatic, Jurica
    Kocev, Dragi
    Dzeroski, Saso
    JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2015, 45 (02) : 247 - 271
  • [25] Label Correction Strategy on Hierarchical Multi-Label Classification
    Ananpiriyakul, Thanawut
    Poomsirivilai, Piyapan
    Vateekul, Peerapon
    MACHINE LEARNING AND DATA MINING IN PATTERN RECOGNITION, MLDM 2014, 2014, 8556 : 213 - 227
  • [26] Dependency Network Methods for Hierarchical Multi-label Classification of Gene Functions
    Fabris, Fabio
    Freitas, Alex A.
    2014 IEEE SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DATA MINING (CIDM), 2014, : 241 - 248
  • [27] Multi-label Random Subspace Ensemble Classification
    Bi, Fan
    Zhu, Jianan
    Feng, Yang
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2024,
  • [28] Dynamic ensemble learning for multi-label classification
    Zhu, Xiaoyan
    Li, Jiaxuan
    Ren, Jingtao
    Wang, Jiayin
    Wang, Guangtao
    INFORMATION SCIENCES, 2023, 623 : 94 - 111
  • [29] Independent Feature and Label Components for Multi-label Classification
    Zhong, Yongjian
    Xu, Chang
    Du, Bo
    Zhang, Lefei
    2018 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2018, : 827 - 836
  • [30] Multi-label Feature Selection Techniques for Hierarchical Multi-label Protein Function Prediction
    Cerri, Ricardo
    Mantovani, Rafael G.
    Basgalupp, Marcio P.
    de Carvalho, Andre C. P. L. F.
    2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018,