An automatic classification technique and tool for information retrieval of web documents

Cited by: 0
Authors
Di Martino, B [1 ]
Mazzocca, N [1 ]
Squeglia, A [1 ]
Mazzeo, A [1 ]
Affiliation
[1] Univ Naples 2, Dipartimento Ingn Informaz, Naples, Italy
Keywords
None listed
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper describes a technique, and its prototype implementation, for information retrieval over text documents, Web pages, and Web sites. The technique is based on automatic classification, specifically an unsupervised hierarchical clustering method. It builds a hierarchical classification tree from the set of analyzed documents and thereby classifies them according to paradigmatic similarity relationships. The technique has been implemented in a prototype tool written entirely in Java. The prototype offers information retrieval functions through a graphical interface built around a document map for searching. The paper describes the tool's main functions.
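The abstract does not say which document features, similarity measure, or merge rule the authors use. As a rough illustration of what unsupervised hierarchical clustering into a classification tree can look like, the following Java sketch assumes raw term-frequency vectors, cosine similarity, and average-linkage agglomerative merging. All class and method names (`HierarchicalClusteringSketch`, `Node`, `termFrequencies`, `cosine`, `cluster`) are hypothetical and are not taken from the paper's tool.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class HierarchicalClusteringSketch {

    /** Tree node: a leaf holds one document, an inner node joins two subtrees. */
    static final class Node {
        final String label;                 // document name for leaves, null for inner nodes
        final Node left, right;
        final List<Map<String, Integer>> members = new ArrayList<>();

        Node(String label, Map<String, Integer> vector) {
            this.label = label; this.left = null; this.right = null;
            members.add(vector);
        }

        Node(Node left, Node right) {
            this.label = null; this.left = left; this.right = right;
            members.addAll(left.members);
            members.addAll(right.members);
        }

        @Override public String toString() {
            return label != null ? label : "(" + left + ", " + right + ")";
        }
    }

    /** Assumed feature extraction: lower-cased word counts. */
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!token.isEmpty()) tf.merge(token, 1, Integer::sum);
        }
        return tf;
    }

    /** Assumed similarity measure: cosine of two term-frequency vectors. */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Integer other = b.get(e.getKey());
            if (other != null) dot += e.getValue() * other;
        }
        for (int v : b.values()) normB += v * v;
        return (normA == 0 || normB == 0) ? 0 : dot / Math.sqrt(normA * normB);
    }

    /** Average linkage: mean pairwise similarity between the documents of two nodes. */
    static double similarity(Node x, Node y) {
        double sum = 0;
        for (Map<String, Integer> a : x.members)
            for (Map<String, Integer> b : y.members)
                sum += cosine(a, b);
        return sum / (x.members.size() * y.members.size());
    }

    /** Repeatedly merges the two most similar nodes until a single tree remains. */
    static Node cluster(Map<String, String> docs) {
        if (docs.isEmpty()) throw new IllegalArgumentException("need at least one document");
        List<Node> active = new ArrayList<>();
        for (Map.Entry<String, String> e : docs.entrySet())
            active.add(new Node(e.getKey(), termFrequencies(e.getValue())));
        while (active.size() > 1) {
            int bestI = 0, bestJ = 1;
            double best = -1;
            for (int i = 0; i < active.size(); i++) {
                for (int j = i + 1; j < active.size(); j++) {
                    double s = similarity(active.get(i), active.get(j));
                    if (s > best) { best = s; bestI = i; bestJ = j; }
                }
            }
            Node merged = new Node(active.get(bestI), active.get(bestJ));
            active.remove(bestJ);           // remove the higher index first
            active.remove(bestI);
            active.add(merged);
        }
        return active.get(0);
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("java-1", "java applet code java");
        docs.put("java-2", "java code compiler");
        docs.put("cooking", "pasta recipe tomato sauce");
        System.out.println(cluster(docs));  // nested parentheses show the tree
    }
}
```

In this toy input the two Java-related documents share terms and are merged first, so they end up as siblings in the tree, with the unrelated document joining at the root. A real system would likely add stop-word removal, term weighting, and HTML parsing, none of which the abstract specifies.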
Pages: 1043-1050
Page count: 8
Related papers
50 in total
  • [31] A Web-based Tool for Segmentation and Automatic Transcription of Historical Documents
    Slimane, Fouad
    Mazzei, Andrea
    Topalov, Orlin
    Verzi, Greta
    Kaplan, Frederic
    2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2017, : 2730 - 2737
  • [32] Sense Disambiguation Technique for Information Retrieval in Web Search
    Jain, Rekha
    Purohit, G. N.
    ADVANCES IN COMPUTING AND INFORMATION TECHNOLOGY, VOL 3, 2013, 178 : 451 - 461
  • [33] Automatic documents classification
    Mohamed, Hoda K.
    2007 INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING & SYSTEMS: ICCES '07, 2007, : 33 - 37
  • [34] FOCA: A System for Classification, Digitalization and Information Retrieval of Trial Balance Documents
    Baydar, Gokce Aydugan
    Arslan, Secil
    PROCEEDINGS OF THE 8TH INTERNATIONAL CONFERENCE ON DATA SCIENCE, TECHNOLOGY AND APPLICATIONS (DATA), 2019, : 174 - 181
  • [35] An approach of information extraction from web documents for automatic ontology generation
    Yeom, KW
    Park, JH
    COMPUTATIONAL INTELLIGENCE AND SECURITY, PT 1, PROCEEDINGS, 2005, 3801 : 450 - 457
  • [36] Automatic classification of web information based on site structure
    Gao, KN
    Yang, LM
    Zhang, B
    Chai, QZ
    Ma, AX
    2005 INTERNATIONAL CONFERENCE ON CYBERWORLDS, PROCEEDINGS, 2005, : 552 - 558
  • [37] Design and implementation of a tool for the automatic construction of hypertexts for information retrieval
    Agosti, M
    Crestani, F
    Melucci, M
    INFORMATION PROCESSING & MANAGEMENT, 1996, 32 (04) : 459 - 476
  • [38] An application of information retrieval technique to automated code classification
    Lim, HS
    Lee, SH
    KNOWLEDGE-BASED INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT 1, PROCEEDINGS, 2005, 3681 : 90 - 96
  • [39] RODIN - An E-Science Tool for Managing Information in the Web of Documents and the Web of Knowledge
    Belmonte, Javier
    Blumer, Eliane
    Ricci, Fabio
    Schneider, Rene
    E-SCIENCE AND INFORMATION MANAGEMENT, 2012, 317 : 4 - 12
  • [40] AUTOMATIC INDEXING OF CONNECTED TEXTS OF RETRIEVAL ANNOTATIONS OF DOCUMENTS FOR SEMANTIC INFORMATION SEARCHING
    PASHCHENKO, NA
    NAUCHNO-TEKHNICHESKAYA INFORMATSIYA SERIYA 2-INFORMATSIONNYE PROTSESSY I SISTEMY, 1972, (11): : 38 - 45