SVM based Chinese web page automatic classification

Cited by: 4
Authors
Liang, JZ [1]
Affiliation
[1] Zhejiang Normal Univ, Inst Comp Sci, Jinhua 321004, Peoples R China
Keywords
support vector machine; statistic learning; web page; text classification; pattern recognition;
DOI
10.1109/ICMLC.2003.1259884
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper addresses Chinese web page classification based on the support vector machine (SVM). First, some methods are proposed for feature extraction and selection based on textual keywords. Then, specific problems in statistical learning theory, the support vector machine, and their application to classification are discussed. A quadratic programming algorithm for constructing the SVM classifier is also described. In the experiments, a sample set of 5096 samples is drawn from the web version of the Chinese People's Daily. It is split into a training set of 3398 samples and a test set of 1698 samples. Two kernel functions, polynomial and radial basis function (RBF), are considered in constructing the SVM classifier. The final classification accuracies are 89.81% and 86.51% for the two classifiers, respectively.
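As an illustration of the two kernel families named in the abstract, the following is a minimal sketch of a polynomial kernel and an RBF kernel in plain Python. The parameter names and values (`degree`, `coef0`, `gamma`) and the toy keyword-count vectors are assumptions chosen for the example; the abstract does not state the settings or features the paper actually used.

```python
import math

def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel: (x . y + coef0) ** degree (parameters are assumed)."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree

def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel: exp(-gamma * ||x - y||^2) (gamma is an assumed value)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Hypothetical keyword-count feature vectors for two web pages.
doc_a = [1.0, 0.0, 2.0]
doc_b = [0.0, 1.0, 2.0]

poly_value = polynomial_kernel(doc_a, doc_b)
rbf_value = rbf_kernel(doc_a, doc_b)
```

Either function could serve as the similarity measure inside an SVM solver; the polynomial kernel grows with the overlap between keyword vectors, while the RBF kernel decays with their squared distance.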
Pages: 2265 - 2268
Number of pages: 4