Who links to whom: Mining linkage between web sites

Cited by: 50
Authors
Bharat, K [1 ]
Chang, BW [1 ]
Henzinger, M [1 ]
Ruhl, M [1 ]
Affiliation
[1] Google Inc, Mt View, CA 94043 USA
DOI
10.1109/ICDM.2001.989500
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Previous studies of the web graph structure have focused on the graph structure at the level of individual pages. In actuality the web is a hierarchically nested graph, with domains, hosts and web sites introducing intermediate levels of affiliation and administrative control. To better understand the growth of the web we need to understand its macro-structure, in terms of the linkage between web sites. In this paper we approximate this by studying the graph of the linkage between hosts on the web. This was done based on snapshots of the web taken by Google in Oct 1999, Aug 2000 and Jun 2001. The connectivity between hosts is represented by a directed graph, with hosts as nodes and weighted edges representing the count of hyperlinks between pages on the corresponding hosts. We demonstrate how such a "hostgraph" can be used to study connectivity properties of hosts and domains over time, and discuss a modified "copy model" to explain observed link weight distributions as a function of subgraph size. We discuss changes in the web over time in the size and connectivity of web sites and country domains. We also describe a data mining application of the hostgraph: a related host finding algorithm which achieves a precision of 0.65 at rank 3.
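As an illustration of the hostgraph described in the abstract, the following minimal Python sketch aggregates page-level hyperlinks into weighted host-level edges. It is not the authors' implementation: the function name `build_hostgraph`, the sample URLs, and the choice to skip links within a single host are all assumptions made for this example.

```python
from collections import Counter
from urllib.parse import urlsplit


def build_hostgraph(page_links):
    """Aggregate (source URL, target URL) pairs into weighted host-level edges.

    Returns a Counter mapping (source_host, target_host) to the number of
    hyperlinks between pages on those two hosts.
    """
    edges = Counter()
    for src_url, dst_url in page_links:
        src_host = urlsplit(src_url).hostname
        dst_host = urlsplit(dst_url).hostname
        # Assumption for this sketch: skip URLs without a host and links
        # that stay within one host, keeping only inter-host linkage.
        if src_host is None or dst_host is None or src_host == dst_host:
            continue
        edges[(src_host, dst_host)] += 1
    return edges


# Hypothetical page-level links, invented for illustration only.
sample_links = [
    ("http://a.example/index.html", "http://b.example/home.html"),
    ("http://a.example/about.html", "http://b.example/docs.html"),
    ("http://a.example/index.html", "http://a.example/about.html"),
    ("http://b.example/home.html", "http://c.example/"),
]

hostgraph = build_hostgraph(sample_links)
for (src, dst), weight in sorted(hostgraph.items()):
    print(f"{src} -> {dst}: {weight}")
```

Each edge weight here is the count of hyperlinks between pages on the two hosts, matching the abstract's definition; whether the original study kept or discarded intra-host links is not stated in this record.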
Pages: 51-58
Page count: 8