An Improved HITS Algorithm Based on Analysis of Web Page Links and Web Content Similarity

Cited by: 6
Authors
Yang, Weiming [1]
Affiliation
[1] Chongqing Normal Univ, Coll Comp & Informat Sci, Chongqing, Peoples R China
Keywords
HITS algorithm; Web content similarity; Authority page; Hub page;
DOI
10.1109/CW.2016.30
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
HITS (Hyperlink-Induced Topic Search) is a classical link analysis algorithm used in Web Structure Mining (WSM). The algorithm takes the structural information of links into account but ignores the correlation between pages and topics. In some cases this leads to "topic drift", a deviation between the search results and the query topic. To address this, the paper presents an improved algorithm that takes both web content similarity and link analysis into account. Our experiment shows that the improved algorithm enhances the relevance of search results and limits the occurrence of topic drift to some degree.
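The abstract does not give the paper's actual update rule, so the following is only a minimal sketch of the general idea: standard HITS iterations in which each link's contribution is scaled by the content similarity between the linked page and the query. The function names (`cosine_similarity`, `weighted_hits`), the bag-of-words cosine measure, and the way relevance is multiplied into both updates are illustrative assumptions, not the author's method.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity of bag-of-words term counts (an assumed similarity measure)."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def normalize(scores):
    """Scale scores to unit L2 norm; leave an all-zero vector unchanged."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    return {p: v / norm for p, v in scores.items()} if norm else dict(scores)


def weighted_hits(links, texts, query, iterations=50):
    """HITS with link contributions scaled by page-query similarity.

    links: dict mapping a page to the list of pages it links to.
    texts: dict mapping every page to its text content.
    Returns (authority, hub) score dicts.
    """
    pages = sorted(texts)
    # Hypothetical weighting: relevance of each page to the query topic.
    relevance = {p: cosine_similarity(texts[p], query) for p in pages}
    auth = {p: 1.0 for p in pages}
    hub = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority update: hub scores of in-linking pages, damped by
        # how relevant the target page is to the query.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src] * relevance[dst]
        # Hub update: authority of out-linked pages, again weighted by
        # the target's relevance so off-topic links add nothing.
        new_hub = {
            p: sum(new_auth[dst] * relevance[dst] for dst in links.get(p, []))
            for p in pages
        }
        auth, hub = normalize(new_auth), normalize(new_hub)
    return auth, hub


if __name__ == "__main__":
    texts = {
        "a": "hits algorithm link analysis",
        "b": "hits algorithm authority hub",
        "c": "notes on the hits algorithm",
        "spam": "cheap flights hotels",
    }
    links = {"a": ["b", "spam"], "c": ["b", "spam"]}
    authority, hubs = weighted_hits(links, texts, "hits algorithm")
    for page in sorted(authority, key=authority.get, reverse=True):
        print(page, round(authority[page], 3))
```

In plain HITS, pages `b` and `spam` in this toy graph would receive the same authority score because they have identical in-links. With the similarity weighting, the off-topic page shares no terms with the query and its authority drops to zero, which is the kind of topic-drift suppression the abstract describes.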
Pages: 147-150
Page count: 4
Related papers
50 in total
  • [1] An Improved HITS Algorithm Based on Page-query Similarity and Page Popularity
    Liu, Xinyue
    Lin, Hongfei
    Zhang, Cong
    JOURNAL OF COMPUTERS, 2012, 7 (01) : 130 - 134
  • [2] Algorithm of Web Page Similarity Comparison Based on Visual Block
    Li, Xingchen
    Zhang, Weizhe
    Wang, Desheng
    Zhang, Bin
    He, Hui
    COMPUTER SCIENCE AND INFORMATION SYSTEMS, 2019, 16 (03) : 815 - 830
  • [3] New use of the HITS algorithm for fast web page classification
    Meadi, Mohamed Nadjib
    Babahenini, Mohamed Chaouki
    Taleb Ahmed, Abdelmalik
    TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2017, 25 (03) : 2015 - 2032
  • [4] An improved PageRank algorithm based on web content
    Zhou Hao
    Pu Qiumei
    Zhang Hong
    Sha Zhihao
    14TH INTERNATIONAL SYMPOSIUM ON DISTRIBUTED COMPUTING AND APPLICATIONS FOR BUSINESS, ENGINEERING AND SCIENCE (DCABES 2015), 2015, : 284 - 287
  • [5] APIMiner: Identifying Web Application APIs Based on Web Page States Similarity Analysis
    Chen, Yuanchao
    Lu, Yuliang
    Pan, Zulie
    Chen, Juxing
    Shi, Fan
    Li, Yang
    Jiang, Yonghui
    ELECTRONICS, 2024, 13 (06)
  • [6] Analysis of web search algorithm hits
    Hong, DW
    Man, SS
    INTERNATIONAL JOURNAL OF FOUNDATIONS OF COMPUTER SCIENCE, 2004, 15 (04) : 649 - 662
  • [7] An improved SVM web page classification algorithm
    Ren, Xun-yi
    Shi, Chen
    Zhang, Dan
    Wang, Wen-si
    2018 INTERNATIONAL SYMPOSIUM ON POWER ELECTRONICS AND CONTROL ENGINEERING (ISPECE 2018), 2019, 1187
  • [8] MixPR-an approach of combining content and links of web page
    Guo, Ye
    FOURTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, VOL 2, PROCEEDINGS, 2007, : 456 - +
  • [9] Proposal of Seam Degree and Content Similarity for Web Page Segmentation
    Zeng, Jun
    Flanagan, Brendan
    Xiong, Qingyu
    Wen, Junhao
    Hirokawa, Sachio
    2013 SECOND IIAI INTERNATIONAL CONFERENCE ON ADVANCED APPLIED INFORMATICS (IIAI-AAI 2013), 2013, : 9 - 14
  • [10] Towards an Improved Vision-based Web Page Segmentation Algorithm
    Cormier, Michael
    Mann, Richard
    Moffatt, Karyn
    Cohen, Robin
    2017 14TH CONFERENCE ON COMPUTER AND ROBOT VISION (CRV 2017), 2017, : 345 - 352