Label-Specific Feature Augmentation for Long-Tailed Multi-Label Text Classification

Cited by: 0
Authors
Xu, Pengyu [1 ]
Xiao, Lin [1 ]
Liu, Bing [1 ]
Lu, Sijin [1 ]
Jing, Liping [1 ]
Yu, Jian [1 ]
Affiliations
[1] Beijing Jiaotong Univ, Beijing Key Lab Traff Data Anal & Min, Beijing, Peoples R China
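Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China;
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract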
Multi-label text classification (MLTC) involves tagging a document with its most relevant subset of labels from a label set. In real applications, labels usually follow a long-tailed distribution, where most labels (called tail labels) are associated with only a small number of documents, which limits the performance of MLTC. To alleviate this low-resource problem, researchers have introduced a simple but effective strategy, data augmentation (DA). However, most existing DA approaches struggle in multi-label settings. The main reason is that documents augmented for one label inevitably influence the other co-occurring labels and can further exacerbate the long-tailed problem. To mitigate this issue, we propose a new pair-level augmentation framework for MLTC, called Label-Specific Feature Augmentation (LSFA), which augments only the positive feature-label pairs for the tail labels. LSFA contains two main parts. The first learns label-specific document representations in a high-level latent space; the second augments tail-label features in that latent space by transferring the documents' second-order statistics (intra-class semantic variations) from head labels to tail labels. Finally, we design a new loss function for adjusting the classifiers based on the augmented datasets. The whole learning procedure can be trained effectively. Comprehensive experiments on benchmark datasets show that the proposed LSFA outperforms state-of-the-art counterparts.
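The idea of transferring second-order statistics from head labels to tail labels can be illustrated with a rough sketch. This is not the authors' implementation: the function name `augment_tail_features`, the Gaussian sampling scheme, and the toy feature arrays are all assumptions made for illustration. It estimates the covariance of one head label's features and uses it to perturb the few available features of a tail label.

```python
import numpy as np

def augment_tail_features(head_feats, tail_feats, n_new, rng):
    """Hypothetical helper: sample extra tail-label features
    using the covariance estimated from a head label."""
    # Second-order statistics (intra-class variation) of the head label.
    head_cov = np.cov(head_feats, rowvar=False)
    # Reuse existing tail features as centers, one per new sample.
    centers = tail_feats[rng.integers(0, len(tail_feats), size=n_new)]
    # Zero-mean perturbations that carry the head label's variation.
    dim = head_feats.shape[1]
    noise = rng.multivariate_normal(np.zeros(dim), head_cov, size=n_new)
    return centers + noise

rng = np.random.default_rng(0)
head = rng.normal(size=(200, 4))         # toy data: many documents for a head label
tail = rng.normal(loc=3.0, size=(3, 4))  # toy data: few documents for a tail label
augmented = augment_tail_features(head, tail, n_new=10, rng=rng)
print(augmented.shape)
```

Only features for the tail label are generated here, so no extra positives are created for co-occurring labels; how LSFA selects head labels and weights the transfer is not stated in the abstract.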
Pages: 10602-10610
Page count: 9
Related papers
50 in total
  • [1] Does Head Label Help for Long-Tailed Multi-Label Text Classification
    Xiao, Lin
    Zhang, Xiangliang
    Jing, Liping
    Huang, Chi
    Song, Mingyang
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 14103 - 14111
  • [2] Granular correlation-based label-specific feature augmentation for multi-label classification
    Zhao, Tianna
    Zhang, Yuanjian
    Miao, Duoqian
    INFORMATION SCIENCES, 2025, 689
  • [3] Label-Specific Document Representation for Multi-Label Text Classification
    Xiao, Lin
    Huang, Xin
    Chen, Boli
    Jing, Liping
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 466 - 475
  • [4] Residual diverse ensemble for long-tailed multi-label text classification
    Jiangxin SHI
    Tong WEI
    Yufeng LI
    Science China (Information Sciences), 2024, 67 (11) : 92 - 105
  • [5] Residual diverse ensemble for long-tailed multi-label text classification
    Shi, Jiangxin
    Wei, Tong
    Li, Yufeng
    SCIENCE CHINA-INFORMATION SCIENCES, 2024, 67 (11)
  • [6] Exploring Contrastive Learning for Long-Tailed Multi-label Text Classification
    Audibert, Alexandre
    Gauffre, Aurelien
    Amini, Massih-Reza
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: RESEARCH TRACK, PT VII, ECML PKDD 2024, 2024, 14947 : 245 - 261
  • [7] Dual Perspective of Label-Specific Feature Learning for Multi-Label Classification
    Hang, Jun-Yi
    Zhang, Min-Ling
    ACM Transactions on Knowledge Discovery from Data, 2024, 19 (01)
  • [8] Multi-Label Classification With Label-Specific Feature Generation: A Wrapped Approach
    Yu, Ze-Bang
    Zhang, Min-Ling
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (09) : 5199 - 5210
  • [9] Dual Perspective of Label-Specific Feature Learning for Multi-Label Classification
    Hang, Jun-Yi
    Zhang, Min-Ling
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
  • [10] Long-tailed Extreme Multi-label Text Classification by the Retrieval of Generated Pseudo Label Descriptions
    Zhang, Ruohong
    Wang, Yau-Shian
    Yang, Yiming
    Yu, Donghan
    Vu, Tom
    Lei, Likun
    17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023, : 1092 - 1106