Towards an intelligent text categorization for web resources: An implementation

Authors
Zadrozny, S [1]
Lawcewicz, K [1]
Kacprzyk, J [1]
Affiliation
[1] Polish Acad Sci, Syst Res Inst, PL-01447 Warsaw, Poland
Keywords
automatic classification of documents; Internet; linguistic terms
DOI
10.1016/B978-044451379-3/50012-1
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose the concept and implementation of a software system, TCAT (Text CATegorization), for automatic recognition of the topic of an Internet document. In training mode, the user provides the system with a list of topics and a set of documents representing each topic (supervised learning). In recognition mode, the system automatically assigns a previously unseen document to a topic category. A simple learning algorithm is devised and implemented. The classification results are presented to the user as a set of linguistic terms. Some new measures of classification correctness are proposed. The implemented system processes documents in several popular Internet-related formats.
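The two modes described in the abstract can be illustrated with a minimal sketch. This is not the TCAT algorithm, which the abstract does not specify. It assumes a plain term-frequency profile per topic, cosine similarity as the matching score, and arbitrary thresholds for mapping scores to the linguistic terms "low", "medium" and "high". All function names and thresholds here are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(labelled_docs):
    """Training mode: build one term-frequency profile per topic.

    labelled_docs maps a topic name to a list of example documents.
    """
    profiles = {}
    for topic, docs in labelled_docs.items():
        counts = Counter()
        for doc in docs:
            counts.update(tokenize(doc))
        profiles[topic] = counts
    return profiles


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def to_linguistic_term(score):
    """Map a numeric score to a linguistic term (thresholds are assumed)."""
    if score >= 0.6:
        return "high"
    if score >= 0.3:
        return "medium"
    return "low"


def classify(profiles, text):
    """Recognition mode: describe the match to each topic in words."""
    doc = Counter(tokenize(text))
    return {
        topic: to_linguistic_term(cosine(doc, profile))
        for topic, profile in profiles.items()
    }
```

A call such as `classify(train({"sports": [...], "finance": [...]}), new_text)` returns one linguistic term per topic rather than a single hard label, which mirrors the abstract's description of how results are presented to the user.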
Pages: 153-164
Number of pages: 12
Related papers
50 in total
  • [1] Text categorization in an intelligent agent for filtering information on the Web
    Gentili, GL
    Marinilli, M
    Micarelli, A
    Sciarrone, F
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2001, 15 (03) : 527 - 549
  • [2] Text Categorization: Implementation
    Jo, Taeho
    Studies in Big Data, 2019, 45 : 129 - 156
  • [3] Web Text Categorization on GBODSS
    Hu, Mingsheng
    Jia, Zhijuan
    ICCSSE 2009: PROCEEDINGS OF 2009 4TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE & EDUCATION, 2009, : 599 - +
  • [4] Hybrid Intelligent Techniques for Text Categorization
    Sadiq, Ahmed T.
    Abdullah, Sura Mahmood
    2012 INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER SCIENCE APPLICATIONS AND TECHNOLOGIES (ACSAT), 2012, : 238 - 245
  • [5] Speedup learning for text categorization and intelligent agents
    Goldberg, JL
    Jenkins, ML
    INFORMATION TECHNOLOGY AND ORGANIZATIONS: TRENDS, ISSUES, CHALLENGES AND SOLUTIONS, VOLS 1 AND 2, 2003, : 893 - 895
  • [6] TEXT CATEGORIZATION AND SORTING OF WEB SEARCH RESULTS
    Radovanovic, Milos
    Ivanovic, Mirjana
    Budimac, Zoran
    COMPUTING AND INFORMATICS, 2009, 28 (06) : 861 - 893
  • [7] Design and implementation of a fast text categorization algorithm
    School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
    Unknown
    Beijing Ligong Daxue Xuebao, 2006, 12 (1069-1072):
  • [8] The design and implementation of an excellent text categorization system
    Lu, MY
    Diao, LL
    Lu, YC
    Zhou, LZ
    PROCEEDINGS OF THE 4TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-4, 2002, : 459 - 463
  • [9] Exploiting semantic resources for large scale text categorization
    Jian Qiang Li
    Yu Zhao
    Bo Liu
    Journal of Intelligent Information Systems, 2012, 39 : 763 - 788
  • [10] Exploiting semantic resources for large scale text categorization
    Li, Jian Qiang
    Zhao, Yu
    Liu, Bo
    JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2012, 39 (03) : 763 - 788