Manifold Adaptive Experimental Design for Text Categorization

Cited by: 119
Authors
Cai, Deng [1]
He, Xiaofei [1]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci, State Key Lab CAD&CG, Hangzhou 310058, Zhejiang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Text categorization; active learning; experimental design; manifold learning; kernel method;
DOI
10.1109/TKDE.2011.104
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In many information processing tasks, labels are expensive to obtain while unlabeled data points are abundant. To reduce the cost of collecting labels, it is crucial to predict which unlabeled examples are the most informative, i.e., which would improve the classifier the most if they were labeled. Many active learning techniques have been proposed for text categorization, such as SVMActive and Transductive Experimental Design. However, most previous approaches try to discover the discriminant structure of the data space, whereas the geometrical structure is not well respected. In this paper, we propose a novel active learning algorithm that is performed in the data manifold adaptive kernel space. The manifold structure is incorporated into the kernel space by using the graph Laplacian. In this way, the manifold adaptive kernel space reflects the underlying geometry of the data. By minimizing the expected error with respect to the optimal classifier, we can select the most representative and discriminative data points for labeling. Experimental results on text categorization demonstrate the effectiveness of the proposed approach.
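The abstract names two ingredients, a kernel deformed by the graph Laplacian and a selection step that minimizes expected error, but this record gives no formulas. The following is a minimal sketch of how the two could fit together, not the authors' implementation. It assumes the kernel deformation K~ = K - K (I + M K)^{-1} M K with M = lam * L, and a greedy sequential selection rule in the style of Transductive Experimental Design. All function names and the parameters `gamma`, `k`, `lam`, and `mu` are hypothetical choices made for illustration.

```python
import numpy as np

def pairwise_sq_dists(X):
    """Squared Euclidean distances between all rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.maximum(d2, 0.0)

def rbf_kernel(X, gamma=1.0):
    """Base kernel (assumed RBF; the record does not specify one)."""
    return np.exp(-gamma * pairwise_sq_dists(X))

def knn_laplacian(X, k=3):
    """Unnormalized Laplacian L = D - W of a symmetrized 0/1 k-NN graph."""
    n = X.shape[0]
    d2 = pairwise_sq_dists(X)
    np.fill_diagonal(d2, np.inf)          # exclude self-neighbors
    idx = np.argsort(d2, axis=1)[:, :k]   # k nearest neighbors per point
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)                # make the graph undirected
    return np.diag(W.sum(axis=1)) - W

def manifold_adaptive_kernel(K, L, lam=0.1):
    """Assumed deformation: K - K (I + M K)^{-1} M K, with M = lam * L."""
    n = K.shape[0]
    MK = (lam * L) @ K
    return K - K @ np.linalg.solve(np.eye(n) + MK, MK)

def greedy_select(K, m, mu=0.01):
    """Greedily pick m points, deflating the kernel after each pick."""
    K = K.copy()
    chosen = []
    for _ in range(m):
        # Score: squared norm of each kernel column over its regularized diagonal.
        scores = np.sum(K ** 2, axis=0) / (np.diag(K) + mu)
        if chosen:
            scores[chosen] = -np.inf      # never pick the same point twice
        i = int(np.argmax(scores))
        chosen.append(i)
        col = K[:, i].copy()
        K -= np.outer(col, col) / (K[i, i] + mu)
    return chosen

# Usage on synthetic data (stand-in for document feature vectors).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
K_tilde = manifold_adaptive_kernel(rbf_kernel(X, gamma=0.5), knn_laplacian(X, k=4))
to_label = greedy_select(K_tilde, m=5)
```

Setting `lam=0` recovers the base kernel, so in this sketch the graph Laplacian term is the only place where the manifold structure enters the selection.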
Pages: 707-719
Page count: 13
Related papers
50 in total
  • [11] An experimental study of boosting model classifiers for Chinese text categorization
    Geng, YB
    Zhu, GM
    Qiu, JR
    Fan, JL
    Zhang, JC
    DIGITAL LIBRARIES: INTERNATIONAL COLLABORATION AND CROSS-FERTILIZATION, PROCEEDINGS, 2004, 3334 : 270 - 279
  • [12] The Application of Text Categorization Technology in Adaptive Learning System for Interpretation of Figures
    Huang, Weibo
    He, Zhenpeng
    Li, Xiaodan
    ADVANCES IN HARMONY SEARCH, SOFT COMPUTING AND APPLICATIONS, 2020, 1063 : 130 - 138
  • [13] Self-Adaptive Weighting Text Association Categorization Algorithm Research
    Li, Liangjun
    Zhang, Bin
    Che, Yuanyuan
    Yang, Ming
    Li, Tienan
    ACHIEVEMENTS IN ENGINEERING MATERIALS, ENERGY, MANAGEMENT AND CONTROL BASED ON INFORMATION TECHNOLOGY, PTS 1 AND 2, 2011, 171-172 : 246 - +
  • [14] Design of Chinese Text Categorization Classifier Based on Attribute Bagging
    Zhang, Xiang
    Zhou, Mingquan
    Dong, Lili
    Ye, Na
    2009 INTERNATIONAL CONFERENCE ON BUSINESS INTELLIGENCE AND FINANCIAL ENGINEERING, PROCEEDINGS, 2009, : 201 - 204
  • [15] Text Categorization: Implementation
    Jo, Taeho
    Studies in Big Data, 2019, 45 : 129 - 156
  • [16] Noisy text categorization
    Vinciarelli, A
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2005, 27 (12) : 1882 - 1895
  • [17] Noisy text categorization
    Vinciarelli, A
    PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 2, 2004, : 554 - 557
  • [18] Text categorization with ILA
    Sever, H
    Gorur, A
    Tolun, MR
    COMPUTER AND INFORMATION SCIENCES - ISCIS 2003, 2003, 2869 : 300 - 307
  • [19] Automated Text Categorization
    Patel, Atul
    Pathak, Samprati
    Khan, Md Irfan
    ICSPC'21: 2021 3RD INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATION (ICPSC), 2021, : 16 - 20
  • [20] Experimental Evaluation of CNN Parameters for Text Categorization in Legal Document Review
    Han, Qian
    Kou, Yufeng
    Snaidauf, Derek
    2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2019, : 4320 - 4324