Hierarchical Feature Selection Based on Label Distribution Learning

Cited by: 40
Authors
Lin, Yaojin [1]
Liu, Haoyang [1]
Zhao, Hong [1]
Hu, Qinghua [2]
Zhu, Xingquan [3]
Wu, Xindong [4]
Affiliations
[1] Minnan Normal Univ, Sch Comp Sci, Key Lab Data Sci & Intelligence Applicat, Zhangzhou 363000, Fujian, Peoples R China
[2] Tianjin Univ, Sch Comp Sci, Tianjin 300354, Peoples R China
[3] Florida Atlantic Univ, Dept Elect Engn & Comp Sci, Boca Raton, FL 33431 USA
[4] Hefei Univ Technol, Key Lab Knowledge Engn Big Data, Minist Educ, Hefei 230009, Anhui, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Feature extraction; Task analysis; Correlation; Electronic mail; Training; Dinosaurs; Computer science; Common and label-specific features; feature selection; hierarchical classification; label distribution learning; label enhancement; CLASSIFICATION;
DOI
10.1109/TKDE.2022.3177246
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Hierarchical classification learning, which organizes data categories into a hierarchical structure, is an effective approach for large-scale classification tasks. The high dimensionality of the data feature space, represented in hierarchical class structures, is one of the main research challenges. In addition, the class hierarchy often introduces imbalanced class distributions and causes overfitting. In this paper, we propose a feature selection method based on label distribution learning to address these challenges. The crux is to alleviate the class imbalance problem and learn a discriminative feature subset for the hierarchical classification process. Because categories in the hierarchical tree structure are correlated, sibling categories can provide additional supervisory information for each learning subtask, which, in turn, alleviates the under-sampling of minority categories. Therefore, we transform hierarchical labels into a hierarchical label distribution to represent this correlation. After that, a discriminative feature subset is selected recursively, under common-feature and label-specific feature constraints, so that downstream classification tasks can achieve the best performance. Experiments and comparisons with seven well-established feature selection algorithms on six real data sets with different degrees of imbalance demonstrate the superiority of the proposed method.
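To make the idea of turning a hard hierarchical label into a label distribution concrete, here is a minimal Python sketch. It is not the paper's label enhancement procedure, which this record does not give. It assumes a simple rule in which a fixed fraction of the label mass is spread evenly over the sibling categories of the true class. The tree `PARENT`, the function `sibling_label_distribution`, and the parameter `sibling_mass` are all hypothetical names introduced for illustration.

```python
# Hypothetical class hierarchy: each node maps to its parent (root maps to None).
PARENT = {
    "root": None,
    "animal": "root",
    "plant": "root",
    "cat": "animal",
    "dog": "animal",
    "bird": "animal",
    "tree": "plant",
}


def sibling_label_distribution(parent, true_label, sibling_mass=0.2):
    """Return a distribution over the true label and its siblings.

    Assumption (not from the paper): the true label keeps
    1 - sibling_mass, and sibling_mass is split evenly among the
    siblings, i.e. the other children of the same parent.
    """
    if not 0.0 <= sibling_mass < 1.0:
        raise ValueError("sibling_mass must be in [0, 1)")

    node_parent = parent[true_label]
    siblings = [
        node for node, p in parent.items()
        if p == node_parent and node != true_label
    ]

    # A node without siblings keeps all of the label mass.
    if not siblings:
        return {true_label: 1.0}

    share = sibling_mass / len(siblings)
    distribution = {s: share for s in siblings}
    distribution[true_label] = 1.0 - sibling_mass
    return distribution


if __name__ == "__main__":
    print(sibling_label_distribution(PARENT, "cat"))
```

Under this assumed rule, a sample labeled `cat` also carries a small description degree for `dog` and `bird`, which is one simple way sibling categories could contribute supervisory information to a minority class.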
Pages: 5964-5976
Page count: 13