Accuracy estimate and optimization techniques for SimRank computation

Cited by: 67
Authors
Lizorkin, Dmitry [1]
Velikhov, Pavel [1]
Grinev, Maxim [1]
Turdakov, Denis [1]
Affiliations
[1] Russian Acad Sci, Inst Syst Programming, Moscow 109004, Russia
Source
VLDB JOURNAL | 2010 / Vol. 19 / Issue 01
Keywords
Similarity measure; Graph theory; SimRank; Algorithm; Computational complexity;
DOI
10.1007/s00778-009-0168-8
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
The measure of similarity between objects is a very useful tool in many areas of computer science, including information retrieval. SimRank is a simple and intuitive measure of this kind, based on a graph-theoretic model. SimRank is typically computed iteratively, in the spirit of PageRank. However, existing work on SimRank lacks accuracy estimation of iterative computation and has discouraging time complexity. In this paper, we present a technique to estimate the accuracy of computing SimRank iteratively. This technique provides a way to find out the number of iterations required to achieve a desired accuracy when computing SimRank. We also present optimization techniques that improve the computational complexity of the iterative algorithm from O(n^4) in the worst case to min(O(nl), O(n^3 / log_2 n)), with n denoting the number of objects and l denoting the number of object-to-object relationships. We also introduce a threshold-sieving heuristic and its accuracy estimation that further improves the efficiency of the method. As a practical illustration of our techniques, we computed SimRank scores on a subset of the English Wikipedia corpus, consisting of the complete set of articles and category links.
Pages: 45 - 66
Number of pages: 22
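The abstract describes computing SimRank iteratively and choosing the number of iterations for a desired accuracy. The following is a minimal Python sketch of the plain (unoptimized) iteration, not the optimized algorithm the paper proposes. The function names, the dictionary-based graph layout, and the decay factor default of 0.8 are assumptions made for illustration. The iteration-count helper relies on the bound s(a,b) - s_k(a,b) <= C^(k+1), which is not stated in the abstract and is assumed here.

```python
def iterations_for_accuracy(c, eps):
    """Smallest k such that c**(k + 1) <= eps.

    Assumes the error after k iterations is bounded by c**(k + 1);
    this bound is an assumption, it does not appear in the abstract.
    """
    k = 0
    while c ** (k + 1) > eps:
        k += 1
    return k


def simrank(in_neighbors, c=0.8, iterations=5):
    """Naive iterative SimRank.

    in_neighbors maps every node to the list of nodes pointing at it
    (hypothetical layout chosen for this sketch).
    """
    nodes = list(in_neighbors)
    # Iteration 0: each node is fully similar to itself only.
    scores = {a: {b: 1.0 if a == b else 0.0 for b in nodes} for a in nodes}
    for _ in range(iterations):
        new_scores = {a: {} for a in nodes}
        for a in nodes:
            for b in nodes:
                if a == b:
                    new_scores[a][b] = 1.0
                    continue
                ia, ib = in_neighbors[a], in_neighbors[b]
                if not ia or not ib:
                    # A node without in-neighbors is similar to nothing else.
                    new_scores[a][b] = 0.0
                    continue
                # Average similarity over all pairs of in-neighbors.
                total = sum(scores[i][j] for i in ia for j in ib)
                new_scores[a][b] = c * total / (len(ia) * len(ib))
        scores = new_scores
    return scores


# Tiny example: node "a" links to both "b" and "c".
graph = {"a": [], "b": ["a"], "c": ["a"]}
k = iterations_for_accuracy(0.8, 0.001)
result = simrank(graph, c=0.8, iterations=k)
print(k, result["b"]["c"])
```

The nested loop over all node pairs and all pairs of their in-neighbors is what makes this naive form expensive on dense graphs; the optimizations summarized in the abstract target exactly that cost and are not reproduced here.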