Sampling Representative Users from Large Social Networks

Cited by: 0
Authors
Tang, Jie [1, 2]
Zhang, Chenhui [1, 2]
Cai, Keke [3]
Zhang, Li [3]
Su, Zhong [3]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Beijing, Peoples R China
[2] TNList, Beijing, Peoples R China
[3] IBM Corp, China Res Lab, Beijing, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Finding a subset of users that statistically represents the original social network is a fundamental issue in Social Network Analysis (SNA). The problem has not been extensively studied in the existing literature. In this paper, we present a formal definition of the problem of sampling representative users from a social network. We propose two sampling models and theoretically prove their NP-hardness. To solve the two models efficiently, we present an algorithm with provable approximation guarantees. Experimental results on two datasets show that the proposed models for sampling representative users significantly outperform (+6%-23% in terms of Precision@100) several alternative methods that use authority or structure information only. The proposed algorithms are also efficient: only a few seconds are needed to sample 300 representative users from a network of 100,000 users. All data and code are publicly available.
Pages: 304-310
Page count: 7
Related papers
50 in total
  • [11] A Monte Carlo sampling method for drawing representative samples from large databases
    Guo, H
    Hou, WC
    Yan, F
    Zhu, Q
    16TH INTERNATIONAL CONFERENCE ON SCIENTIFIC AND STATISTICAL DATABASE MANAGEMENT, PROCEEDINGS, 2004, : 419 - 420
  • [12] FROM USERS' MOTIVATIONS TO BRANDING: THE CASE OF SOCIAL NETWORKS
    Andrei, Andreia Gabriela
    Iacob, Amalasunta
    PROCEEDINGS OF THE IVTH INTERNATIONAL CONFERENCE ON GLOBALIZATION AND HIGHER EDUCATION IN ECONOMICS AND BUSINESS ADMINISTRATION - GEBA 2010, 2011, : 139 - 144
  • [13] Estimating Influence of Social Media Users from Sampled Social Networks
    Kimura, Kazuma
    Tsugawa, Sho
    PROCEEDINGS OF THE 2016 IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING ASONAM 2016, 2016, : 1302 - 1308
  • [14] Sampling informative patterns from large single networks
    Chehreghani, Mostafa Haghir
    Abdessalem, Talel
    Bifet, Albert
    Bouzbila, Meriem
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2020, 106 : 653 - 658
  • [15] On Sampling Type Distribution from Heterogeneous Social Networks
    Li, Jhao-Yin
    Yeh, Mi-Yen
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PT II: 15TH PACIFIC-ASIA CONFERENCE, PAKDD 2011, 2011, 6635 : 111 - 122
  • [16] Network Sampling with Memory: A Proposal for More Efficient Sampling from Social Networks
    Mouw, Ted
    Verdery, Ashton M.
    SOCIOLOGICAL METHODOLOGY 2012, VOL 42, 2012, 42 : 206 - 256
  • [17] Visualizing the evolution of users' profiles from online social networks
    Tchuente, Dieudonne
    Canut, Marie-Francoise
    Jessel, Nadine Baptiste
    Peninou, Andre
    El Haddadi, Anass
    2010 INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING (ASONAM 2010), 2010, : 370 - 374
  • [18] Sampling Based Katz Centrality Estimation for Large-Scale Social Networks
    Lin, Mingkai
    Li, Wenzhong
    Nguyen, Cam-tu
    Wang, Xiaoliang
    Lu, Sanglu
    ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2019, PT II, 2020, 11945 : 584 - 598
  • [19] Sampling Online Social Networks
    Papagelis, Manos
    Das, Gautam
    Koudas, Nick
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2013, 25 (03) : 662 - 676
  • [20] Sampling Online Social Networks Using Coupling From The Past
    White, Kenton
    Li, Guichong
    Japkowicz, Nathalie
    12TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2012), 2012, : 266 - 272