Simple classification into large topic ontology of Web documents

Cited by: 0
Authors
Grobelnik, M [1 ]
Mladenic, D [1 ]
Affiliations
[1] Jozef Stefan Inst, Ljubljana 1000, Slovenia
Keywords
classification of documents; topic ontology of Web documents; Web document context; link structure of the Web;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The paper presents an approach to classifying Web documents into a large topic ontology. The main emphasis is on keeping the approach simple enough to handle a large ontology while providing it with enriched data: additional information on the Web page context obtained from the link structure of the Web. The context is generated from the in-coming and out-going links of the Web document we want to classify (the target document). To represent a document, we therefore use not only the text of the document itself but also the text from the documents pointing to the target document and the text from the documents that the target document points to. The idea is that the enriched data compensates for the simplicity of the approach while keeping it efficient and capable of handling a large topic ontology.
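The link-context representation described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names (`tokenize`, `enriched_vector`, `cosine`, `classify`), the `context_weight` parameter, the toy topics, and the choice of cosine similarity against per-topic word vectors as the "simple" classifier. The abstract does not state which classifier or weighting the authors use.

```python
from collections import Counter
import math
import re


def tokenize(text):
    """Lowercase the text and keep runs of letters (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def enriched_vector(target_text, in_texts, out_texts, context_weight=0.5):
    """Bag of words for the target document plus words from its link context.

    in_texts:  texts of documents that point to the target document
    out_texts: texts of documents that the target document points to
    context_weight is a hypothetical down-weighting of context words.
    """
    vec = Counter()
    for tok in tokenize(target_text):
        vec[tok] += 1.0
    for text in list(in_texts) + list(out_texts):
        for tok in tokenize(text):
            vec[tok] += context_weight
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse word-weight vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(doc_vec, topic_vecs):
    """Return the topic whose vector is most similar to the document vector."""
    return max(topic_vecs, key=lambda topic: cosine(doc_vec, topic_vecs[topic]))


# Toy ontology with two topics (hypothetical data).
topics = {
    "computing": Counter(tokenize("software code computer program")),
    "sports": Counter(tokenize("football match team score")),
}

# The target text alone shares no words with either topic.
plain = enriched_vector("Results from last night", [], [])
# Adding text from in-coming and out-going links supplies the missing evidence.
enriched = enriched_vector(
    "Results from last night",
    in_texts=["football team news"],
    out_texts=["match score table"],
)
predicted = classify(enriched, topics)
```

In this toy case the target text on its own has zero similarity to both topics, while the enriched vector overlaps with the "sports" topic, which is the kind of compensation for a simple classifier that the abstract argues for.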
Pages: 201-206
Page count: 6
Related papers
  • [1] Topic selection of web documents using specific domain ontology
    Kong, Hyunjang
    Hwang, Myunggwon
    Hwang, Gwangsu
    Shim, Jaehong
    Kim, Pankoo
    MICAI 2006: ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2006, 4293 : 1047 - +
  • [2] Ontology-based automatic classification of web documents
    Song, MuHee
    Lim, SooYeon
    Kang, DongJin
    Lee, SangJo
    COMPUTATIONAL INTELLIGENCE, PT 2, PROCEEDINGS, 2006, 4114 : 690 - 700
  • [3] Design and implementation of an ontology algorithm for web documents classification
    Wei, Guiyi
    Yu, Jun
    Ling, Yun
    Liu, Jun
    COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2006, PT 4, 2006, 3983 : 649 - 658
  • [4] Ontology-based automatic classification and ranking for web documents
    Fang, Jun
    Guo, Lei
    Wang, XiaoDong
    Yang, Ning
    FOURTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, VOL 3, PROCEEDINGS, 2007, : 627 - 631
  • [5] Ontology based Fuzzy Classification of Web Documents for Semantic Information Retrieval
    Joshi, Kajal
    Verma, Ashish
    Kandpal, Ankita
    Garg, Shalini
    Chauhan, Rashmi
    Goudar, R. H.
    2013 SIXTH INTERNATIONAL CONFERENCE ON CONTEMPORARY COMPUTING (IC3), 2013, : 1 - 5
  • [6] Graph vs. bag representation models for the topic classification of web documents
    Papadakis, George
    Giannakopoulos, George
    Paliouras, Georgios
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2016, 19 (05): : 887 - 920
  • [7] Graph vs. bag representation models for the topic classification of web documents
    George Papadakis
    George Giannakopoulos
    Georgios Paliouras
    World Wide Web, 2016, 19 : 887 - 920
  • [8] A Domain Ontology Learning from Web Documents
    Djaanfar, Ahmed Said
    Frikh, Bouchra
    Ouhbi, Brahim
    INTELLIGENT DISTRIBUTED COMPUTING IV, 2010, 315 : 201 - +
  • [9] Mapping documents onto Web page ontology
    Mladenic, D
    Grobelnik, M
    WEB MINING: FROM WEB TO SEMANTIC WEB, 2004, 3209 : 77 - 96
  • [10] CLASSIFICATION OF SENSITIVE WEB DOCUMENTS
    Gao, Hui
    Fu, Yan
    Li, Jian-Ping
    2008 INTERNATIONAL CONFERENCE ON APPERCEIVING COMPUTING AND INTELLIGENCE ANALYSIS (ICACIA 2008), 2008, : 295 - 298