Towards an intelligent text categorization for web resources: An implementation

Authors
Zadrozny, S [1]
Lawcewicz, K [1]
Kacprzyk, J [1]
Affiliations
[1] Polish Acad Sci, Syst Res Inst, PL-01447 Warsaw, Poland
Keywords
automatic classification of documents; Internet; linguistic terms
DOI
10.1016/B978-044451379-3/50012-1
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose the concept and implementation of a software system, TCAT (Text CATegorization), for automatic recognition of the topic of an Internet document. In the training mode, the user provides the system with a list of topics and a set of documents representing each topic (supervised learning). In the recognition mode, the system automatically assigns a previously unseen document to a topic category. A simple learning algorithm is devised and implemented. The results of the classification are presented to the user in the form of a set of linguistic terms. Some new measures of the correctness of the classification are proposed. The implemented system processes documents in several popular Internet-related formats.
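The two modes described in the abstract can be illustrated with a minimal sketch. The abstract does not specify TCAT's learning algorithm, its document representation, or how linguistic terms are derived, so everything below is an assumption made for illustration: term-frequency topic profiles, cosine similarity, and the hypothetical cut-offs in `to_linguistic_term`. None of the function names come from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Training mode: build one summed term-frequency profile per topic.

    `examples` maps a topic name to a list of documents for that topic.
    (Assumed representation; the paper's own algorithm is not given here.)
    """
    profiles = {}
    for topic, docs in examples.items():
        profile = Counter()
        for doc in docs:
            profile.update(tokenize(doc))
        profiles[topic] = profile
    return profiles


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def to_linguistic_term(score):
    """Map a similarity score to a linguistic term (hypothetical cut-offs)."""
    if score >= 0.6:
        return "high"
    if score >= 0.3:
        return "medium"
    return "low"


def classify(profiles, document):
    """Recognition mode: describe the match to each topic with a linguistic term."""
    vector = Counter(tokenize(document))
    return {
        topic: to_linguistic_term(cosine(vector, profile))
        for topic, profile in profiles.items()
    }


if __name__ == "__main__":
    profiles = train({
        "sports": ["the team won the football match", "the player scored a goal"],
        "finance": ["the bank raised interest rates", "stock market prices fell"],
    })
    print(classify(profiles, "the football team scored"))
```

Reporting a term per topic, rather than a single winning label, mirrors the abstract's statement that results are shown as a set of linguistic terms; the three-level scale is only one possible choice.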
Pages: 153-164
Number of pages: 12
Related papers (50 in total)
  • [41] Towards Intelligent Semantic Caching for Web Sources
    Dongwon Lee
    Wesley W. Chu
    Journal of Intelligent Information Systems, 2001, 17 : 23 - 45
  • [42] Towards Intelligent Wireless Web Services for Tourism
    Kanellopoulos, Dimitris
    Kotsiantis, Sotiris
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2006, 6 (7B) : 83 - 90
  • [43] Towards intelligent semantic caching for web sources
    Lee, D
    Chu, WW
    JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2001, 17 (01) : 23 - 45
  • [44] Towards automatic and optimal Filtering Levels for Feature Selection in Text Categorization
    Montañés, E
    Combarro, EF
    Díaz, I
    Ranilla, J
    ADVANCES IN INTELLIGENT DATA ANALYSIS VI, PROCEEDINGS, 2005, 3646 : 239 - 248
  • [45] Towards an Intelligent Framework to Understand and Feed the Web
    Fensel, Anna
    Neidhardt, Julia
    Pobiedina, Nataliia
    Fensel, Dieter
    Werthner, Hannes
    BUSINESS INFORMATION SYSTEMS WORKSHOPS, BIS 2012, 2012, 127 : 255 - 266
  • [46] Theme section "Towards Intelligent Geoprocessing on the Web"
    Li, Songnian
    Dragicevic, Suzana
    Veenendaal, Bert
    Brovelli, Maria Antonia
    ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2013, 83 : 138 - 139
  • [47] The Design and Implementation of an Intelligent Distributed Text Retrieval System
    Yu, Wang
    Wang, Guohua
    2012 2ND INTERNATIONAL CONFERENCE ON UNCERTAINTY REASONING AND KNOWLEDGE ENGINEERING (URKE), 2012 : 189 - 192
  • [48] to:// Towards an Open Namespace for Web Resources
    Garcia Lopez, Pedro
    Espelt Palau, Marc
    20TH ACM CONFERENCE ON HYPERTEXT AND HYPERMEDIA (HYPERTEXT 2009), 2009 : 335 - 336
  • [49] The study on Web product reviews mining based on an improved text categorization algorithm
    Hu, Dongbin
    Luo, Lixia
    Xu, Lihua
    ELECTRONIC-BUSINESS INTELLIGENCE: FOR CORPORATE COMPETITIVE ADVANTAGES IN THE AGE OF EMERGING TECHNOLOGIES & GLOBALIZATION, 2010, 14 : 449 - 455
  • [50] Web Text Categorization for Enterprise Decision Support Based on SVMs - An Application of GBODSS
    Jia, Zhijuan
    Hu, Mingsheng
    Song, Haigang
    Hong, Liu
    ADVANCES IN NEURAL NETWORKS - ISNN 2009, PT 2, PROCEEDINGS, 2009, 5552 : 753 - +