Practical Recommendations on Crawling Online Social Networks

Cited by: 130
Authors
Gjoka, Minas [1]
Kurant, Maciej [1]
Butts, Carter T. [1, 2]
Markopoulou, Athina [1, 3]
Affiliations
[1] Univ Calif Irvine, Calif Inst Telecomm & Informat Technol CalIT2, Irvine, CA 92697 USA
[2] Univ Calif Irvine, Dept Sociol, Irvine, CA 92697 USA
[3] Univ Calif Irvine, Dept EECS, Irvine, CA 92697 USA
Funding
Swiss National Science Foundation; US National Science Foundation
Keywords
Sampling methods; Social network services; Facebook; Random Walks; Convergence; Measurements; Graph sampling;
DOI
10.1109/JSAC.2011.111011
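Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract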
Our goal in this paper is to develop a practical framework for obtaining a uniform sample of users in an online social network (OSN) by crawling its social graph. Such a sample allows us to estimate any user property and some topological properties as well. To this end, first, we consider and compare several candidate crawling techniques. Two approaches that can produce approximately uniform samples are the Metropolis-Hastings random walk (MHRW) and a re-weighted random walk (RWRW). Both have pros and cons, which we demonstrate through a comparison to each other as well as to the "ground truth." In contrast, using Breadth-First-Search (BFS) or an unadjusted Random Walk (RW) leads to substantially biased results. Second, and in addition to offline performance assessment, we introduce online formal convergence diagnostics to assess sample quality during the data collection process. We show how these diagnostics can be used to effectively determine when a random walk sample is of adequate size and quality. Third, as a case study, we apply the above methods to Facebook and we collect the first, to the best of our knowledge, representative sample of Facebook users. We make it publicly available and employ it to characterize several key properties of Facebook.
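The contrast the abstract draws between MHRW, RWRW, and an unadjusted RW can be illustrated with a minimal sketch on a toy graph. This is not the authors' implementation: the graph, the function names (build_toy_graph, mhrw, simple_rw, rwrw_mean, mean_degree), and the walk lengths are all assumptions chosen for illustration. The MHRW step uses the standard acceptance probability min(1, deg(current)/deg(candidate)), and the RWRW estimate weights each visited node by 1/degree.

```python
import random


def build_toy_graph():
    """Hypothetical 10-node graph: one hub plus a few low-degree nodes."""
    edges = [(0, i) for i in range(1, 10)] + [(1, 2), (3, 4), (5, 6), (7, 8)]
    graph = {v: [] for v in range(10)}
    for a, b in edges:
        graph[a].append(b)
        graph[b].append(a)
    return graph


def simple_rw(graph, start, steps, rng):
    """Unadjusted random walk: visits nodes in proportion to their degree."""
    node, visited = start, []
    for _ in range(steps):
        node = rng.choice(graph[node])
        visited.append(node)
    return visited


def mhrw(graph, start, steps, rng):
    """Metropolis-Hastings random walk targeting a uniform node distribution."""
    node, visited = start, []
    for _ in range(steps):
        candidate = rng.choice(graph[node])
        # Accept with probability min(1, deg(node) / deg(candidate));
        # on rejection the walk stays put and the node is counted again.
        if rng.random() <= len(graph[node]) / len(graph[candidate]):
            node = candidate
        visited.append(node)
    return visited


def mean_degree(graph, walk):
    """Plain (unweighted) average degree over the visited nodes."""
    return sum(len(graph[v]) for v in walk) / len(walk)


def rwrw_mean(graph, walk, f):
    """Re-weighted estimate of the mean of f from a simple RW sample."""
    weights = [1.0 / len(graph[v]) for v in walk]
    return sum(w * f(v) for w, v in zip(weights, walk)) / sum(weights)


graph = build_toy_graph()
rng = random.Random(0)
rw_walk = simple_rw(graph, 0, 20000, rng)
mh_walk = mhrw(graph, 0, 20000, rng)
print("true mean degree:", sum(len(n) for n in graph.values()) / len(graph))
print("unadjusted RW   :", mean_degree(graph, rw_walk))
print("MHRW            :", mean_degree(graph, mh_walk))
print("RWRW            :", rwrw_mean(graph, rw_walk, lambda v: len(graph[v])))
```

On this toy graph the unadjusted walk over-represents the hub, so its plain average degree should come out well above the true mean, while the MHRW average and the re-weighted RW estimate should both land close to it.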
Pages: 1872-1892
Page count: 21
Related papers
50 in total
  • [31] Probabilistic Spreading of Recommendations in Social Networks
    Davoudi, Anahita
    Chatterjee, Mainak
    2015 IEEE MILITARY COMMUNICATIONS CONFERENCE (MILCOM 2015), 2015, : 1373 - 1378
  • [32] Online social networks in economics
    Mayer, Adalbert
    DECISION SUPPORT SYSTEMS, 2009, 47 (03) : 169 - 184
  • [33] Analyzing Online Social Networks
    Howard, Bill
    COMMUNICATIONS OF THE ACM, 2008, 51 (11) : 14 - 16
  • [34] Online social support networks
    Mehta, Neil
    Atreja, Ashish
    INTERNATIONAL REVIEW OF PSYCHIATRY, 2015, 27 (02) : 118 - 123
  • [35] Online social networks and wellbeing
    Summerskill, Benjamin
    LANCET, 2009, 374 (9689) : 514 - 514
  • [36] Foraging Online Social Networks
    Koot, Gijs
    in 't Veld, Mirjam A. A. Huis
    Hendricksen, Joost
    Kaptein, Rianne
    de Vries, Arnout
    van den Broek, Egon L.
    2014 IEEE JOINT INTELLIGENCE AND SECURITY INFORMATICS CONFERENCE (JISIC), 2014, : 312 - 315
  • [37] Benchmarking Online Social Networks
    Nicolas Terevinto, Pablo
    Perez, Miguel
    Domenech, Josep
    Gil, Jose A.
    Pont, Ana
    PROCEEDINGS OF THE 2016 IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING ASONAM 2016, 2016, : 164 - 169
  • [38] Online social networks and learning
    Greenhow, Christine
    ON THE HORIZON, 2011, 19 (01) : 4 - +
  • [39] Online social networks are not addictive
    Carbonell, Xavier
    Oberst, Ursula
    ALOMA-REVISTA DE PSICOLOGIA CIENCIES DE L EDUCACIO I DE L ESPORT, 2015, 33 (02) : 13 - 19
  • [40] Sampling Online Social Networks
    Papagelis, Manos
    Das, Gautam
    Koudas, Nick
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2013, 25 (03) : 662 - 676