Factors affecting web page similarity

Cited by: 0
Authors
Tombros, A [1]
Ali, ZS [1]
Affiliation
[1] Queen Mary Univ London, Dept Comp Sci, London E1 4NS, England
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Tools that allow effective information organisation, access and navigation are becoming increasingly important on the Web. Similarity between web pages is a concept that is central to such tools. In this paper, we examine the effect that content and layout-related aspects of web pages have on web page similarity. We consider the textual content contained within common HTML tags, the structural layout of pages, and the query terms contained within pages. Our study shows that combinations of factors can yield more promising results than individual factors, and that different aspects of web pages affect similarities between pages in different ways. We found a number of factors that, when taken into account, can result in effective measures of similarity between web pages. Query information, in particular, proved to be important for the effective organisation of web pages.
Pages: 487-501
Number of pages: 15