Practical Recommendations on Crawling Online Social Networks

Cited by: 130
Authors
Gjoka, Minas [1]
Kurant, Maciej [1]
Butts, Carter T. [1, 2]
Markopoulou, Athina [1, 3]
Affiliations
[1] Univ Calif Irvine, Calif Inst Telecomm & Informat Technol CalIT2, Irvine, CA 92697 USA
[2] Univ Calif Irvine, Dept Sociol, Irvine, CA 92697 USA
[3] Univ Calif Irvine, Dept EECS, Irvine, CA 92697 USA
Funding
Swiss National Science Foundation; US National Science Foundation
Keywords
Sampling methods; Social network services; Facebook; Random Walks; Convergence; Measurements; Graph sampling;
DOI
10.1109/JSAC.2011.111011
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification code
0808; 0809
Abstract
Our goal in this paper is to develop a practical framework for obtaining a uniform sample of users in an online social network (OSN) by crawling its social graph. Such a sample allows one to estimate any user property and some topological properties as well. To this end, first, we consider and compare several candidate crawling techniques. Two approaches that can produce approximately uniform samples are the Metropolis-Hastings random walk (MHRW) and a re-weighted random walk (RWRW). Both have pros and cons, which we demonstrate through a comparison to each other as well as to the "ground truth." In contrast, using Breadth-First-Search (BFS) or an unadjusted Random Walk (RW) leads to substantially biased results. Second, and in addition to offline performance assessment, we introduce online formal convergence diagnostics to assess sample quality during the data collection process. We show how these diagnostics can be used to effectively determine when a random walk sample is of adequate size and quality. Third, as a case study, we apply the above methods to Facebook and we collect the first, to the best of our knowledge, representative sample of Facebook users. We make it publicly available and employ it to characterize several key properties of Facebook.
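The difference between the crawling techniques named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the six-node toy graph, the function names, and the choice of mean node degree as the estimated property are all assumptions made for illustration. The sketch uses the standard textbook forms of the two corrections: MHRW accepts a move from v to w with probability min(1, deg(v)/deg(w)), and RWRW weights each node visited by a plain random walk by 1/degree. No burn-in or convergence diagnostic is applied here.

```python
import random

# Hypothetical toy graph as adjacency lists: node 0 is a hub, node 5 a leaf.
GRAPH = {0: [1, 2, 3, 4, 5], 1: [0, 2], 2: [0, 1], 3: [0, 4], 4: [0, 3], 5: [0]}


def degree(v):
    return len(GRAPH[v])


def random_walk(start, steps, rng):
    """Unadjusted RW: in the long run, visits nodes in proportion to degree."""
    walk, v = [], start
    for _ in range(steps):
        v = rng.choice(GRAPH[v])
        walk.append(v)
    return walk


def mhrw(start, steps, rng):
    """Metropolis-Hastings RW whose target distribution is uniform over nodes."""
    walk, v = [], start
    for _ in range(steps):
        w = rng.choice(GRAPH[v])
        # Accept the proposed move with probability min(1, deg(v) / deg(w));
        # on rejection the walk stays at v, and v is recorded again.
        if rng.random() < degree(v) / degree(w):
            v = w
        walk.append(v)
    return walk


def plain_mean(walk, f):
    """Unweighted sample mean of f over the visited nodes."""
    return sum(f(v) for v in walk) / len(walk)


def reweighted_mean(walk, f):
    """RWRW-style estimate: weight each visited node by 1 / degree."""
    num = sum(f(v) / degree(v) for v in walk)
    den = sum(1.0 / degree(v) for v in walk)
    return num / den


rng = random.Random(0)
steps = 100_000
true_mean = sum(degree(v) for v in GRAPH) / len(GRAPH)  # ground truth: 14 / 6

rw_walk = random_walk(0, steps, rng)
mh_walk = mhrw(0, steps, rng)

rw_biased = plain_mean(rw_walk, degree)     # biased toward high-degree nodes
rwrw_est = reweighted_mean(rw_walk, degree) # corrected after the walk
mhrw_est = plain_mean(mh_walk, degree)      # corrected during the walk
```

On this toy graph the unadjusted walk overestimates the mean degree because the hub is visited most often, while both corrected estimates approach the ground truth of 14/6. The design trade-off is that MHRW spends steps on rejected moves, whereas RWRW uses every step but needs each node's degree at estimation time.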
Pages: 1872-1892
Page count: 21
Related papers
50 in total
  • [1] Crawling Online Social Networks
    Erlandsson, Fredrik
    Niat, Roozbeh
    Boldt, Martin
    Johnson, Henric
    Wu, S. Felix
    SECOND EUROPEAN NETWORK INTELLIGENCE CONFERENCE (ENIC 2015), 2015: 9-16
  • [2] Multitenant approach to crawling of online social networks
    Butakov, Nikolay
    Petrov, Maxim
    Radice, Anton
    5TH INTERNATIONAL YOUNG SCIENTIST CONFERENCE ON COMPUTATIONAL SCIENCE, YSC 2016, 2016, 101: 115-124
  • [3] Friendship Recommendations in Online Social Networks
    Carullo, Giuliana
    Castiglione, Aniello
    De Santis, Alfredo
    2014 INTERNATIONAL CONFERENCE ON INTELLIGENT NETWORKING AND COLLABORATIVE SYSTEMS (INCOS), 2014: 42-48
  • [4] Crawling and Detecting Community Structure in Online Social Networks Using Local Information
    Blenn, Norbert
    Doerr, Christian
    Van Kester, Bas
    Van Mieghem, Piet
    NETWORKING 2012, PT I, 2012, 7289: 56-67
  • [5] Travel Routes Recommendations via Online Social Networks
    Comito, Carmela
    PROCEEDINGS OF THE 2019 IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING (ASONAM 2019), 2019: 1168-1173
  • [6] Time-aware recommendations in online social networks
    Xing, Xing
    Zhang, Weishi
    Zhang, Xiuguo
    Jia, Zhichun
    Xu, Nan
    Journal of Computational Information Systems, 2013, 9(10): 4155-4162
  • [7] Exploiting social capital for improving personalized recommendations in online social networks
    de Souza, Paulo Roberto
    Durao, Frederico Araujo
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 246
  • [8] Practical Privacy-Preserving Friend Recommendations on Social Networks
    Brendel, William
    Han, Fangqiu
    Marujo, Luis
    Jie, Luo
    Korolova, Aleksandra
    COMPANION PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE 2018 (WWW 2018), 2018: 111-112
  • [9] Representation of Rules for Relevant Recommendations to Online Social Networks Users
    Bouraga, Sarah
    Jureta, Ivan
    Faulkner, Stephane
    SECOND INTERNATIONAL WORKSHOP ON ARTIFICIAL INTELLIGENCE FOR REQUIREMENTS ENGINEERING (AIRE 2015), 2015: 33-40
  • [10] Crawling Credible Online Medical Sentiments for Social Intelligence
    Abbasi, Ahmed
    Fu, Tianjun
    Zeng, Daniel
    Adjeroh, Donald
    2013 ASE/IEEE INTERNATIONAL CONFERENCE ON SOCIAL COMPUTING (SOCIALCOM), 2013: 254-263