Manifold Adaptive Experimental Design for Text Categorization

Cited by: 119
Authors
Cai, Deng [1 ]
He, Xiaofei [1 ]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci, State Key Lab CAD&CG, Hangzhou 310058, Zhejiang, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Text categorization; active learning; experimental design; manifold learning; kernel method;
DOI
10.1109/TKDE.2011.104
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In many information processing tasks, labels are usually expensive and unlabeled data points are abundant. To reduce the cost of collecting labels, it is crucial to predict which unlabeled examples are the most informative, i.e., which would improve the classifier the most if they were labeled. Many active learning techniques have been proposed for text categorization, such as SVMActive and Transductive Experimental Design. However, most previous approaches try to discover the discriminant structure of the data space, whereas the geometrical structure is not well respected. In this paper, we propose a novel active learning algorithm which is performed in the data manifold adaptive kernel space. The manifold structure is incorporated into the kernel space by using the graph Laplacian. This way, the manifold adaptive kernel space reflects the underlying geometry of the data. By minimizing the expected error with respect to the optimal classifier, we can select the most representative and discriminative data points for labeling. Experimental results on text categorization have demonstrated the effectiveness of our proposed approach.
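The abstract names two ingredients: a kernel deformed by the graph Laplacian, and an experimental-design criterion for picking points to label. The sketch below illustrates one commonly used construction of each and is not taken from the paper itself. The deformation formula `(I + lam * K L)^{-1} K`, the k-nearest-neighbor graph, the RBF base kernel, the greedy scoring rule, and all function and parameter names (`lam`, `mu`, `gamma`, `k`) are assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Base kernel (assumed RBF): K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def knn_graph_laplacian(X, k=3):
    """Unnormalized Laplacian L = D - W of a symmetrized k-NN graph (assumed graph type)."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)          # a point is not its own neighbor
    nbrs = np.argsort(d2, axis=1)[:, :k]  # indices of the k nearest neighbors
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    W = np.maximum(W, W.T)                # make the adjacency symmetric
    return np.diag(W.sum(axis=1)) - W

def manifold_adaptive_kernel(K, L, lam=1.0):
    """Deformed Gram matrix (I + lam * K L)^{-1} K; lam = 0 recovers K (assumed form)."""
    n = K.shape[0]
    return np.linalg.solve(np.eye(n) + lam * (K @ L), K)

def greedy_select(K, m, mu=0.1):
    """Greedy design in kernel space (assumed rule): pick the point whose kernel
    column has the largest regularized squared norm, then deflate K."""
    K = K.copy()
    n = K.shape[0]
    available = np.ones(n, dtype=bool)
    chosen = []
    for _ in range(m):
        scores = np.sum(K ** 2, axis=0) / (np.diag(K) + mu)
        scores[~available] = -np.inf      # never pick the same point twice
        i = int(np.argmax(scores))
        chosen.append(i)
        available[i] = False
        K -= np.outer(K[:, i], K[:, i]) / (K[i, i] + mu)
    return chosen
```

Under these assumptions, a typical call sequence would be `K = rbf_kernel(X)`, `L = knn_graph_laplacian(X)`, `K_M = manifold_adaptive_kernel(K, L)`, and then `greedy_select(K_M, m)` to obtain the indices of `m` points to send for labeling. Larger `lam` makes the kernel follow the neighborhood graph more closely.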
Pages: 707-719
Number of pages: 13