Supervised topic models with weighted words: multi-label document classification

Authors
Yue-peng Zou
Ji-hong Ouyang
Xi-ming Li
Affiliations
[1] Jilin University, College of Computer Science and Technology
[2] Jilin University, MOE Key Laboratory of Symbolic Computation and Knowledge Engineering
Keywords
Supervised topic model; Multi-label classification; Class frequency; Labeled latent Dirichlet allocation (L-LDA); Dependency-LDA
DOI
Not available
CLC number
TP391
Abstract
Supervised topic modeling algorithms have been successfully applied to multi-label document classification tasks. Representative models include labeled latent Dirichlet allocation (L-LDA) and dependency-LDA. However, these models neglect the class frequency information of words (i.e., the number of classes where a word has occurred in the training data), which is significant for classification. To address this, we propose a method, namely the class frequency weight (CF-weight), to weight words by considering the class frequency knowledge. This CF-weight is based on the intuition that a word with higher (lower) class frequency will be less (more) discriminative. In this study, the CF-weight is used to improve L-LDA and dependency-LDA. A number of experiments have been conducted on real-world multi-label datasets. Experimental results demonstrate that CF-weight based algorithms are competitive with the existing supervised topic models.
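The class frequency idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual CF-weight formula, so the IDF-style weight `log(C / cf) + 1` below is an assumption chosen only to show the stated intuition (higher class frequency, lower weight). The function names, the toy documents, and the choice to credit a word to every label of a multi-label document are likewise hypothetical.

```python
import math
from collections import defaultdict


def class_frequency(docs):
    """Count, for each word, the number of distinct classes it occurs in.

    docs: iterable of (tokens, labels) pairs from the training data.
    Assumption: a word in a multi-label document counts toward every
    label attached to that document.
    """
    classes_per_word = defaultdict(set)
    for tokens, labels in docs:
        for word in set(tokens):
            classes_per_word[word].update(labels)
    return {word: len(classes) for word, classes in classes_per_word.items()}


def cf_weight(cf, num_classes):
    """Illustrative IDF-style weight; NOT the formula from the paper."""
    return {word: math.log(num_classes / n) + 1.0 for word, n in cf.items()}


# Toy multi-label training set (hypothetical data).
train_docs = [
    (["goal", "match", "the"], {"sports"}),
    (["election", "vote", "the"], {"politics"}),
    (["stock", "market", "the", "vote"], {"finance", "politics"}),
]
num_classes = len({label for _, labels in train_docs for label in labels})

cf = class_frequency(train_docs)
weights = cf_weight(cf, num_classes)

for word in ("the", "vote", "goal"):
    print(word, cf[word], round(weights[word], 3))
```

In this toy set, "the" occurs in all three classes and receives the smallest weight, while "goal" occurs in one class and receives the largest. How such weights enter L-LDA or dependency-LDA inference is not described in the abstract and is not sketched here.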
Pages: 513-523
Number of pages: 10
Related papers
50 in total
  • [1] Supervised topic models with weighted words: multi-label document classification
    Yue-peng ZOU
    Ji-hong OUYANG
    Xi-ming LI
    Frontiers of Information Technology & Electronic Engineering, 2018, 19 (04) : 513 - 523
  • [2] Supervised topic models with weighted words: multi-label document classification
    Zou, Yue-peng
    Ouyang, Ji-hong
    Li, Xi-ming
    FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2018, 19 (04) : 513 - 523
  • [3] Supervised topic models for multi-label classification
    Li, Ximing
    Ouyang, Jihong
    Zhou, Xiaotang
    NEUROCOMPUTING, 2015, 149 : 811 - 819
  • [4] Statistical topic models for multi-label document classification
    Timothy N. Rubin
    America Chambers
    Padhraic Smyth
    Mark Steyvers
    Machine Learning, 2012, 88 : 157 - 208
  • [5] Statistical topic models for multi-label document classification
    Rubin, Timothy N.
    Chambers, America
    Smyth, Padhraic
    Steyvers, Mark
    MACHINE LEARNING, 2012, 88 (1-2) : 157 - 208
  • [6] Semi-supervised Multi-Label Topic Models for Document Classification and Sentence Labeling
    Soleimani, Hossein
    Miller, David J.
    CIKM'16: PROCEEDINGS OF THE 2016 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2016, : 105 - 114
  • [7] Labelset topic model for multi-label document classification
    Ximing Li
    Jihong Ouyang
    Xiaotang Zhou
    Journal of Intelligent Information Systems, 2016, 46 : 83 - 97
  • [8] Labelset topic model for multi-label document classification
    Li, Ximing
    Ouyang, Jihong
    Zhou, Xiaotang
    JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2016, 46 (01) : 83 - 97
  • [9] Online multi-label dependency topic models for text classification
    Sophie Burkhardt
    Stefan Kramer
    Machine Learning, 2018, 107 : 859 - 886
  • [10] Online multi-label dependency topic models for text classification
    Burkhardt, Sophie
    Kramer, Stefan
    MACHINE LEARNING, 2018, 107 (05) : 859 - 886