An automatic classification technique and tool for information retrieval of web documents

Cited by: 0
Authors
Di Martino, B [1]
Mazzocca, N [1]
Squeglia, A [1]
Mazzeo, A [1]
Affiliation
[1] Univ Naples 2, Dipartimento Ingn Informaz, Naples, Italy
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
In this paper we describe a technique, and its prototype implementation, for information retrieval over text documents, Web pages, and Web sites. The technique is based on automatic classification, specifically an unsupervised hierarchical clustering method. It produces a hierarchical classification tree from the set of analyzed documents and thereby classifies them according to paradigmatic similarity relationships. The technique has been implemented in a prototype tool written entirely in Java. The prototype provides the user with information retrieval functions through a graphical interface based on a search document map. The paper describes the main functions of the tool.
Pages: 1043-1050
Page count: 8