Visual similarity comparison for Web page retrieval

Cited by: 14
Authors
Takama, Y [1]
Mitsuhashi, N [1]
Affiliations
[1] Tokyo Metropolitan Univ, Tokyo 158, Japan
DOI
10.1109/WI.2005.157
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A method for comparing Web pages in terms of visual similarity is proposed. Conventional Web information retrieval and gathering systems, such as search engines, extract keywords from HTML source files and calculate the similarity between pages based on those keywords. The extracted keywords are treated as semantic features representing the contents of Web pages. However, the visual features of Web pages are as important as their semantic features, because HTML is designed to present a Web page in a manner that humans can understand. The proposed method compares the layouts of Web pages based on image processing and graph matching. The experimental results show that the accuracy of layout analysis is 91.6% on average, and that the visual similarity calculated by the proposed method is closer to the visual judgment of test subjects than that of a color-based comparison method.
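As a rough illustration of what comparing layouts through a graph can mean, the sketch below treats each page as a list of labelled rectangular regions, derives spatial relations between them as graph edges, and scores two pages by the overlap of those edges. This is not the method of the paper: the region format, the labels, the two relation types, and the Jaccard score are all assumptions made for illustration, and the image-processing step that would extract regions from a rendered page is omitted.

```python
from itertools import combinations

# Hypothetical region format (an assumption, not from the paper):
# (label, x, y, width, height), in pixels, origin at the top-left corner.

def layout_relations(regions):
    """Return the spatial relations between regions as a set of labelled edges."""
    edges = set()
    for (la, xa, ya, wa, ha), (lb, xb, yb, wb, hb) in combinations(regions, 2):
        if ya + ha <= yb:
            edges.add((la, "above", lb))
        elif yb + hb <= ya:
            edges.add((lb, "above", la))
        elif xa + wa <= xb:
            edges.add((la, "left_of", lb))
        elif xb + wb <= xa:
            edges.add((lb, "left_of", la))
    return edges

def layout_similarity(page_a, page_b):
    """Jaccard overlap of the two relation sets, a value in [0, 1]."""
    ea, eb = layout_relations(page_a), layout_relations(page_b)
    if not ea and not eb:
        return 1.0
    return len(ea & eb) / len(ea | eb)

# Illustrative pages: same arrangement at different sizes, then a moved menu.
page_a = [("header", 0, 0, 800, 100), ("menu", 0, 100, 200, 500), ("body", 200, 100, 600, 500)]
page_b = [("header", 0, 0, 1024, 80), ("menu", 0, 80, 180, 600), ("body", 180, 80, 844, 600)]
page_c = [("header", 0, 0, 800, 100), ("body", 0, 100, 600, 500), ("menu", 600, 100, 200, 500)]

print(layout_similarity(page_a, page_b))
print(layout_similarity(page_a, page_c))
```

Because the score depends only on relative arrangement, pages of different pixel sizes with the same header, menu, and body placement compare as identical, while moving the menu to the other side lowers the score. Matching nodes by shared labels sidesteps the general graph matching problem, which a real system would have to solve.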
Pages: 301-304
Page count: 4
Related papers
50 in total
  • [1] Algorithm of Web Page Similarity Comparison Based on Visual Block
    Li, Xingchen
    Zhang, Weizhe
    Wang, Desheng
    Zhang, Bin
    He, Hui
    COMPUTER SCIENCE AND INFORMATION SYSTEMS, 2019, 16 (03) : 815 - 830
  • [2] Measuring Web Page Similarity Based on Textual and Visual Properties
    Bartik, Vladimir
    ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, PT II, 2012, 7268 : 13 - 21
  • [3] SimiLay: A Developing Web Page Layout Based Visual Similarity Search Engine
    Bozkir, Ahmet Selman
    Sezer, Ebru Akcapinar
    MACHINE LEARNING AND DATA MINING IN PATTERN RECOGNITION, MLDM 2014, 2014, 8556 : 457 - 470
  • [4] A web-based protein retrieval system by matching visual similarity
    Yeh, JS
    Chen, DY
    Ming, O
    2005 Emerging Information Technology Conference (EITC), 2005, : 177 - 179
  • [5] Factors affecting web page similarity
    Tombros, A
    Ali, ZS
    ADVANCES IN INFORMATION RETRIEVAL, 2005, 3408 : 487 - 501
  • [6] Web page retrieval by combining evidence
    Figuerola, Carlos G.
    Alonso Berrocal, Jose L.
    Zazo, Angel F.
    Vazquez de Aldana, Emilio Rodriguez
    ACCESSING MULTILINGUAL INFORMATION REPOSITORIES, 2006, 4022 : 880 - 887
  • [7] Clustering web sessions by levels of page similarity
    Nichele, Caren Moraes
    Becker, Karin
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2006, 3918 : 346 - 350
  • [8] Similarity Measures for Visual Comparison and Retrieval of Test Data in Aluminum Production
    Jekic, Nikolina
    Mutlu, Belgin
    Schreyer, Manuela
    Neubert, Steffen
    Schreck, Tobias
    IVAPP: PROCEEDINGS OF THE 16TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS - VOL. 3: IVAPP, 2021, : 210 - 218
  • [9] Cross-Browser Differences Detection Based on an Empirical Metric for Web Page Visual Similarity
    Xu, Zhen
    Miller, James
    ACM TRANSACTIONS ON INTERNET TECHNOLOGY, 2018, 18 (03)
  • [10] Retrieval of document images based on page layout similarity
    Naveen
    Guru, D. S.
    ADAPTIVE MULTIMEDIA RETRIEVAL: USER, CONTEXT, AND FEEDBACK, 2007, 4398 : 136 - +