Simple classification into large topic ontology of Web documents

Authors
Grobelnik, M [1]
Mladenic, D [1]
Affiliations
[1] Jozef Stefan Inst, Ljubljana 1000, Slovenia
Keywords
classification of documents; topic ontology of Web documents; Web document context; link structure of the Web
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The paper presents an approach to classifying Web documents into a large topic ontology. The main emphasis is on a simple approach that can handle a large ontology, supplied with enriched data: additional information on the Web page context obtained from the link structure of the Web. The context is generated from the in-coming and out-going links of the Web document to be classified (the target document). That is, a document is represented not only by its own text but also by the text of the documents pointing to the target document and the text of the documents that the target document points to. The idea is that the enriched data compensates for the simplicity of the approach while keeping it efficient and capable of handling a large topic ontology.
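The link-based context described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy pages, the regex tokenizer, and the plain word-count representation are all assumptions made for illustration, and the record does not say how the paper weights or combines the three text sources.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def enriched_bag_of_words(target, texts, out_links):
    """Count words in the target page plus its in-linked and out-linked pages.

    texts maps a page id to its text; out_links maps a page id to the list
    of page ids it points to. Both structures are hypothetical.
    """
    # Pages the target document points to (out-going links).
    outgoing = out_links.get(target, [])
    # Pages that point to the target document (in-coming links).
    incoming = [src for src, dsts in out_links.items() if target in dsts]

    bag = Counter(tokenize(texts[target]))
    for page in list(outgoing) + incoming:
        bag.update(tokenize(texts.get(page, "")))
    return bag


# Toy example: page "a" links to "b", and page "c" links to "a".
texts = {
    "a": "machine learning course",
    "b": "learning text classification",
    "c": "web ontology",
}
out_links = {"a": ["b"], "c": ["a"]}
bag = enriched_bag_of_words("a", texts, out_links)
```

Here the representation of page "a" picks up words such as "classification" and "ontology" that appear only in its neighbours, which is the kind of enrichment the abstract describes. Any classifier that accepts word counts could then be applied to the result.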
Pages: 201-206
Page count: 6