An Adaptive Markov Model for Text Categorization

Cited by: 1
Authors
Li, Jin [1]
Yue, Kun [2]
Liu, Weiyi [2]
Affiliations
[1] Yunnan Univ, Sch Software, Kunming, Peoples R China
[2] Yunnan Univ, Sch Informat Sci & Engn, Kunming, Peoples R China
Funding
National Natural Science Foundation of China
DOI
10.1109/ISKE.2008.4731039
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Existing methods for text categorization assume that a document is a bag of words. While computationally efficient, such a representation cannot capture sequential information. In this paper, a document is treated as a sequence of characters or words, so preprocessing steps for text categorization, such as word segmentation and feature selection, are not required. Statistical dependencies among the neighboring terms of a sequence are captured by Markov models of different orders. We propose a sequence classification method based on an adaptive Markov model. Our method automatically and effectively blends Markov models of different orders for text categorization. We present an extensive experimental evaluation of our method on an English collection and one Chinese corpus. The results show that our method achieves high recall and precision.
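The idea of blending Markov models of different orders can be illustrated with a small sketch. The abstract does not give the adaptation rule, so the code below is not the authors' method: it assumes fixed, uniform blending weights, add-one smoothing, character-level models, and uniform class priors. The names `CharMarkovModel` and `BlendedMarkovClassifier` are hypothetical and used only for illustration.

```python
import math
from collections import defaultdict


class CharMarkovModel:
    """Character-level Markov model of a fixed order with add-one smoothing."""

    def __init__(self, order):
        self.order = order
        self.context_counts = defaultdict(int)
        self.transition_counts = defaultdict(int)

    def train(self, text):
        # Pad the start so the first characters also have a full context.
        padded = "\x02" * self.order + text
        for i in range(self.order, len(padded)):
            context = padded[i - self.order:i]
            self.context_counts[context] += 1
            self.transition_counts[(context, padded[i])] += 1

    def prob(self, context, ch, vocab_size):
        # Add-one (Laplace) smoothing keeps unseen transitions non-zero.
        num = self.transition_counts.get((context, ch), 0) + 1
        den = self.context_counts.get(context, 0) + vocab_size
        return num / den


class BlendedMarkovClassifier:
    """One set of order-0..max_order models per class, linearly blended.

    Assumption: the weights are fixed and uniform here; an adaptive scheme
    would instead choose them from the data.
    """

    def __init__(self, max_order=2, weights=None):
        self.max_order = max_order
        n = max_order + 1
        self.weights = weights or [1.0 / n] * n
        self.models = {}  # label -> list of CharMarkovModel, one per order
        self.vocab = set()

    def fit(self, documents, labels):
        for text, label in zip(documents, labels):
            if label not in self.models:
                self.models[label] = [
                    CharMarkovModel(k) for k in range(self.max_order + 1)
                ]
            for model in self.models[label]:
                model.train(text)
            self.vocab.update(text)

    def log_likelihood(self, text, label):
        vocab_size = len(self.vocab) + 1  # one extra slot for unseen characters
        padded = "\x02" * self.max_order + text
        total = 0.0
        for i in range(self.max_order, len(padded)):
            # Blend the per-order probabilities of the current character.
            p = sum(
                w * m.prob(padded[i - m.order:i], padded[i], vocab_size)
                for w, m in zip(self.weights, self.models[label])
            )
            total += math.log(p)
        return total

    def predict(self, text):
        # Uniform class priors assumed: pick the class with the highest likelihood.
        return max(self.models, key=lambda lbl: self.log_likelihood(text, lbl))


if __name__ == "__main__":
    clf = BlendedMarkovClassifier(max_order=2)
    clf.fit(
        ["the team won the match", "stock prices fell sharply"],
        ["sports", "finance"],
    )
    print(clf.predict("the match"))
```

Because the models operate on raw characters, the sketch needs no word segmentation or feature selection, which mirrors the motivation stated in the abstract; the same classes would work on word sequences if `text` were replaced by a list of tokens and the padding adjusted accordingly.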
Pages: 802 / +
Number of pages: 2