Towards Efficient SimRank Computation on Large Networks

Cited by: 0
Authors
Yu, Weiren [1]
Lin, Xuemin [1]
Zhang, Wenjie [1]
Affiliations
[1] Univ New S Wales, Sydney, NSW, Australia
Keywords
ALGORITHMS;
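DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract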
SimRank has been a powerful model for assessing the similarity of pairs of vertices in a graph. It is based on the concept that two vertices are similar if they are referenced by similar vertices. Due to its self-referentiality, fast SimRank computation on large graphs poses significant challenges. The state-of-the-art work [17] exploits partial sums memorization for computing SimRank in $O(Kmn)$ time on a graph with $n$ vertices and $m$ edges, where $K$ is the number of iterations. Partial sums memorizing can reduce repeated calculations by caching part of similarity summations for later reuse. However, we observe that computations among different partial sums may have duplicate redundancy. Besides, for a desired accuracy $\epsilon$, the existing SimRank model requires $K = \lceil \log_C \epsilon \rceil$ iterations [17], where $C$ is a damping factor. Nevertheless, such a geometric rate of convergence is slow in practice if a high accuracy is desirable. In this paper, we address these gaps. (1) We propose an adaptive clustering strategy to eliminate partial sums redundancy (i.e., duplicate computations occurring in partial sums), and devise an efficient algorithm for speeding up the computation of SimRank to $O(Kd'n^2)$ time, where $d'$ is typically much smaller than the average in-degree of a graph. (2) We also present a new notion of SimRank that is based on a differential equation and can be represented as an exponential sum of transition matrices, as opposed to the geometric sum of the conventional counterpart. This leads to a further speedup in the convergence rate of SimRank iterations. (3) Using real and synthetic data, we empirically verify that our approach of partial sums sharing outperforms the best known algorithm by up to one order of magnitude, and that our revised notion of SimRank further achieves a 5X speedup on large graphs while also fairly preserving the relative order of original SimRank scores.
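To make the baseline in the abstract concrete, the following is a minimal Python sketch of conventional iterative SimRank with partial-sum reuse, together with the iteration bound quoted from [17]. It is an illustration under stated assumptions, not the authors' algorithm: the function names, the `in_neighbors` adjacency-list layout, and the default damping factor 0.6 are all hypothetical choices, and the adaptive clustering and differential-equation variants proposed in the paper are not shown.

```python
import math


def iterations_for_accuracy(C, eps):
    """Iteration count K = ceil(log_C(eps)) quoted in the abstract from [17]."""
    return math.ceil(math.log(eps) / math.log(C))


def simrank(in_neighbors, C=0.6, K=5):
    """Conventional iterative SimRank with partial-sum reuse (illustrative).

    in_neighbors[v] lists the vertices that have an edge into v.
    Returns an n x n list of similarity scores after K iterations.
    """
    n = len(in_neighbors)
    # s(a, a) = 1 and s(a, b) = 0 initially for a != b.
    S = [[1.0 if a == b else 0.0 for b in range(n)] for a in range(n)]
    for _ in range(K):
        S_next = [[1.0 if a == b else 0.0 for b in range(n)] for a in range(n)]
        for a in range(n):
            Ia = in_neighbors[a]
            if not Ia:
                continue  # a vertex without in-neighbors is similar only to itself
            # Partial sum over I(a): computed once, then reused for every b.
            partial = [sum(S[i][j] for i in Ia) for j in range(n)]
            for b in range(n):
                Ib = in_neighbors[b]
                if a == b or not Ib:
                    continue
                total = sum(partial[j] for j in Ib)
                S_next[a][b] = C * total / (len(Ia) * len(Ib))
        S = S_next
    return S


if __name__ == "__main__":
    # Toy graph (assumed for illustration): vertex 0 points to vertices 1 and 2.
    scores = simrank([[], [0], [0]], C=0.6, K=3)
    print(scores[1][2])
    print(iterations_for_accuracy(0.6, 0.001))
```

Caching `partial` once per vertex `a` is what brings each iteration down to roughly O(mn) work, in line with the O(Kmn) bound the abstract attributes to [17]. With C = 0.6 and an accuracy of 0.001, the bound gives K = 14, which illustrates why the abstract calls the geometric convergence rate slow when high accuracy is required.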
Pages: 601-612
Number of pages: 12