Manifold Adaptive Experimental Design for Text Categorization

Cited by: 119
Authors
Cai, Deng [1 ]
He, Xiaofei [1 ]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci, State Key Lab CAD&CG, Hangzhou 310058, Zhejiang, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Text categorization; active learning; experimental design; manifold learning; kernel method;
DOI
10.1109/TKDE.2011.104
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In many information processing tasks, labels are expensive while unlabeled data points are abundant. To reduce the cost of collecting labels, it is crucial to predict which unlabeled examples are the most informative, i.e., which would improve the classifier the most if they were labeled. Many active learning techniques have been proposed for text categorization, such as SVMActive and Transductive Experimental Design. However, most previous approaches try to discover the discriminant structure of the data space, whereas the geometrical structure is not well respected. In this paper, we propose a novel active learning algorithm which is performed in the data manifold adaptive kernel space. The manifold structure is incorporated into the kernel space by using the graph Laplacian. This way, the manifold adaptive kernel space reflects the underlying geometry of the data. By minimizing the expected error with respect to the optimal classifier, we can select the most representative and discriminative data points for labeling. Experimental results on text categorization demonstrate the effectiveness of the proposed approach.
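The pipeline the abstract outlines (fold a graph Laplacian into a kernel, then pick points by an experimental-design criterion in that kernel space) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the paper's confirmed formulation: the kernel deformation is the construction known from the semi-supervised learning literature, K~ = K - K (I + λLK)^{-1} λLK, the selection step is a greedy sequential rule in the style of Transductive Experimental Design, and the function names, the RBF base kernel, and the parameters `gamma`, `k`, `lam`, and `mu` are hypothetical.

```python
import numpy as np

def pairwise_sq_dists(X):
    # Squared Euclidean distances between all rows of X.
    sq = np.sum(X ** 2, axis=1)
    return np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)

def rbf_kernel(X, gamma=1.0):
    # Base kernel (assumed choice; any positive semidefinite kernel would do).
    return np.exp(-gamma * pairwise_sq_dists(X))

def knn_graph_laplacian(X, k=3):
    # Unnormalized Laplacian L = D - W of a symmetrized k-nearest-neighbor graph.
    n = X.shape[0]
    d2 = pairwise_sq_dists(X)
    np.fill_diagonal(d2, np.inf)          # exclude self-neighbors
    idx = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)
    return np.diag(W.sum(axis=1)) - W

def manifold_adaptive_kernel(K, L, lam=0.1):
    # Deformed kernel: K~ = K - K (I + lam*L*K)^{-1} (lam*L*K).
    n = K.shape[0]
    M = lam * L @ K
    return K - K @ np.linalg.solve(np.eye(n) + M, M)

def greedy_design(K, m, mu=0.1):
    # Greedy selection: pick the point whose kernel column explains the most
    # remaining structure, then deflate the kernel matrix by that point.
    K = K.copy()
    chosen = np.zeros(K.shape[0], dtype=bool)
    selected = []
    for _ in range(m):
        scores = np.sum(K ** 2, axis=0) / (np.diag(K) + mu)
        scores[chosen] = -np.inf          # never pick the same point twice
        i = int(np.argmax(scores))
        selected.append(i)
        chosen[i] = True
        K = K - np.outer(K[:, i], K[i, :]) / (K[i, i] + mu)
    return selected

# Toy data standing in for document feature vectors (synthetic, for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
K = rbf_kernel(X, gamma=0.5)
L = knn_graph_laplacian(X, k=3)
K_tilde = manifold_adaptive_kernel(K, L, lam=0.1)
picked = greedy_design(K_tilde, m=5)
```

In this sketch `picked` holds the indices of the points that would be sent for labeling. Setting `lam=0` reduces the deformed kernel to the base kernel, which makes it easy to compare selection with and without the graph term.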
Pages: 707 - 719
Number of pages: 13
Related papers
50 in total
  • [21] A Comparative Experimental Assessment of a Threshold Selection Algorithm in Hierarchical Text Categorization
    Addis, Andrea
    Armano, Giuliano
    Vargiu, Eloisa
    ADVANCES IN INFORMATION RETRIEVAL, 2011, 6611 : 32 - 42
  • [22] Manifold Regularized Experimental Design for Active Learning
    Zhang, Lining
    Shum, Hubert P. H.
    Shao, Ling
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2017, 26 (02) : 969 - 981
  • [23] Software design patterns classification and selection using text categorization approach
    Hussain, Shahid
    Keung, Jacky
    Khan, Arif Ali
    APPLIED SOFT COMPUTING, 2017, 58 : 225 - 244
  • [24] Neural Text Categorizer for Exclusive Text Categorization
    Jo, Taeho
    JOURNAL OF INFORMATION PROCESSING SYSTEMS, 2008, 4 (02) : 77 - 86
  • [25] Contextual Text Categorization: An Improved Stemming Algorithm to Increase the Quality of Categorization in Arabic Text
    Gadri, Said
    Moussaoui, Abdelouahab
    INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2017, 14 (06) : 835 - 841
  • [26] Multiclass Boosting with Adaptive Group-Based kNN and Its Application in Text Categorization
    La, Lei
    Guo, Qiao
    Yang, Dequan
    Cao, Qimin
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2012, 2012
  • [27] Text categorization with WEKA: A survey
    Merlini, Donatella
    Rossini, Martina
    MACHINE LEARNING WITH APPLICATIONS, 2021, 4
  • [28] Web Text Categorization on GBODSS
    Hu, Mingsheng
    Jia, Zhijuan
    ICCSSE 2009: PROCEEDINGS OF 2009 4TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE & EDUCATION, 2009, : 599 - +
  • [29] Comparison of Text Categorization Algorithms
    SHI Yong-feng
    Wuhan University Journal of Natural Sciences, 2004, (05) : 798 - 804
  • [30] Automated Text Document Categorization
    Yasotha, R.
    Charles, E. Y. A.
    2015 IEEE SEVENTH INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND INFORMATION SYSTEMS (ICICIS), 2015, : 522 - 528