Towards Efficient SimRank Computation on Large Networks

Cited by: 0
Authors
Yu, Weiren [1]
Lin, Xuemin [1]
Zhang, Wenjie [1]
Affiliations
[1] Univ New S Wales, Sydney, NSW, Australia
Keywords
ALGORITHMS;
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
SimRank is a powerful model for assessing the similarity of pairs of vertices in a graph. It is based on the concept that two vertices are similar if they are referenced by similar vertices. Because of this self-referentiality, fast SimRank computation on large graphs poses significant challenges. The state-of-the-art work [17] exploits partial sums memorization to compute SimRank in O(Kmn) time on a graph with n vertices and m edges, where K is the number of iterations. Partial sums memorization reduces repeated calculations by caching part of the similarity summations for later reuse. However, we observe that computations among different partial sums may themselves be redundant. Besides, for a desired accuracy ε, the existing SimRank model requires K = ⌈log_C ε⌉ iterations [17], where C is a damping factor. Such a geometric rate of convergence is slow in practice when high accuracy is required. In this paper, we address these gaps. (1) We propose an adaptive clustering strategy to eliminate partial sums redundancy (i.e., duplicate computations occurring in partial sums), and devise an efficient algorithm that speeds up the computation of SimRank to O(Kd'n^2) time, where d' is typically much smaller than the average in-degree of a graph. (2) We also present a new notion of SimRank that is based on a differential equation and can be represented as an exponential sum of transition matrices, as opposed to the geometric sum of the conventional counterpart. This leads to a further speedup in the convergence rate of SimRank iterations. (3) Using real and synthetic data, we empirically verify that our approach of partial sums sharing outperforms the best known algorithm by up to one order of magnitude, and that our revised notion of SimRank further achieves a 5X speedup on large graphs while fairly preserving the relative order of the original SimRank scores.
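The baseline that the abstract builds on can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes the standard SimRank recurrence (s(a,a) = 1, and otherwise s(a,b) is C times the average similarity over pairs of in-neighbors), and it shows only the conventional iteration and the partial sums caching attributed to [17], not the paper's partial sums sharing or differential model. The function names, the dictionary-based graph layout, and the toy graph in the usage note are hypothetical choices made for illustration.

```python
import math


def iterations_needed(C, eps):
    """Iteration count K = ceil(log_C eps) quoted in the abstract."""
    return math.ceil(math.log(eps) / math.log(C))


def _identity(nodes):
    # Initial scores: every vertex is fully similar to itself only.
    return {a: {b: 1.0 if a == b else 0.0 for b in nodes} for a in nodes}


def simrank_naive(in_neighbors, C=0.6, K=5):
    """Plain SimRank iteration: sums over every pair of in-neighbors."""
    nodes = list(in_neighbors)
    sim = _identity(nodes)
    for _ in range(K):
        new = _identity(nodes)
        for a in nodes:
            for b in nodes:
                Ia, Ib = in_neighbors[a], in_neighbors[b]
                if a == b or not Ia or not Ib:
                    continue  # diagonal stays 1, empty in-sets give 0
                total = sum(sim[i][j] for i in Ia for j in Ib)
                new[a][b] = C * total / (len(Ia) * len(Ib))
        sim = new
    return sim


def simrank_partial_sums(in_neighbors, C=0.6, K=5):
    """Same scores, but the inner sum over I(a) is cached per vertex a
    and reused for every b (an assumed rendering of partial sums caching)."""
    nodes = list(in_neighbors)
    sim = _identity(nodes)
    for _ in range(K):
        new = _identity(nodes)
        for a in nodes:
            Ia = in_neighbors[a]
            if not Ia:
                continue
            # partial[j] = sum of s(i, j) over i in I(a), computed once per a
            partial = {j: sum(sim[i][j] for i in Ia) for j in nodes}
            for b in nodes:
                Ib = in_neighbors[b]
                if a == b or not Ib:
                    continue
                new[a][b] = C * sum(partial[j] for j in Ib) / (len(Ia) * len(Ib))
        sim = new
    return sim
```

As a usage example, on the toy graph `{"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}` (each key maps to its list of in-neighbors) both functions give the same scores, and vertices `b` and `c`, which share the single in-neighbor `a`, receive similarity C. Caching the partial sums removes the nested sum over pairs of in-neighbors, which is the source of the O(Kmn) bound mentioned in the abstract.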
Pages: 601-612
Number of pages: 12