An Improved HITS Algorithm Based on Analysis of Web Page Links and Web Content Similarity

Cited by: 6
Author
Yang, Weiming [1]
Affiliation
[1] Chongqing Normal Univ, Coll Comp & Informat Sci, Chongqing, Peoples R China
Keywords
HITS algorithm; Web content similarity; Authority page; Hub page;
DOI
10.1109/CW.2016.30
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
HITS (Hyperlink-Induced Topic Search) is a classical link analysis algorithm in Web Structure Mining (WSM). The algorithm takes the structural information of links into account but ignores the correlation between pages and topics. In some cases this causes "topic drift", a deviation between the search results and the query topic. To address this problem, this paper presents an improved algorithm that takes into account both web content similarity and link analysis. Our experiment shows that the improved algorithm enhances the relevance of search results and limits the occurrence of topic drift to some degree.
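The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea of combining link analysis with content similarity, not the author's actual method. It assumes that each link is weighted by the cosine similarity between the target page's text and the query; the function names, the bag-of-words representation, and the weighting scheme are all illustrative assumptions.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between bag-of-words term-frequency vectors."""
    va = Counter(text_a.lower().split())
    vb = Counter(text_b.lower().split())
    dot = sum(count * vb[term] for term, count in va.items())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def l2_normalize(scores):
    """Scale scores to unit Euclidean length; all-zero scores stay unchanged."""
    norm = math.sqrt(sum(s * s for s in scores.values()))
    if norm == 0:
        return dict(scores)
    return {page: s / norm for page, s in scores.items()}


def similarity_weighted_hits(links, texts, query, iterations=50):
    """HITS variant with content-weighted links (hypothetical weighting).

    links: dict mapping a source page to the list of pages it links to.
    texts: dict mapping every page to its text content.
    query: the search topic as plain text.
    """
    pages = list(texts)
    # Assumed weighting: a link counts in proportion to how similar
    # the target page's text is to the query.
    relevance = {p: cosine_similarity(texts[p], query) for p in pages}
    auth = dict.fromkeys(pages, 1.0)
    hub = dict.fromkeys(pages, 1.0)
    for _ in range(iterations):
        # Authority update: weighted sum of hub scores of in-linking pages.
        new_auth = dict.fromkeys(pages, 0.0)
        for source, targets in links.items():
            for target in targets:
                new_auth[target] += relevance[target] * hub[source]
        auth = l2_normalize(new_auth)
        # Hub update: weighted sum of authority scores of linked pages.
        new_hub = dict.fromkeys(pages, 0.0)
        for source, targets in links.items():
            for target in targets:
                new_hub[source] += relevance[target] * auth[target]
        hub = l2_normalize(new_hub)
    return auth, hub
```

Under this assumed weighting, a page that attracts many links but shares no terms with the query receives an authority score of zero, which is one simple way to damp topic drift. Setting every relevance value to 1 reduces the sketch to standard HITS.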
Pages: 147-150
Page count: 4