Mining the key predictors for event outbreaks in social networks

Cited by: 11
Authors
Yi, Chengqi [1]
Bao, Yuanyuan [2,3]
Xue, Yibo [2,3]
Affiliations
[1] Harbin Univ Sci & Technol, Sch Comp Sci & Technol, Harbin 150080, Peoples R China
[2] Tsinghua Univ, Res Inst Informat Technol, FIT Bldg 3-418, Beijing 100084, Peoples R China
[3] Tsinghua Univ, Tsinghua Natl Lab Informat Sci & Technol, Beijing 100084, Peoples R China
Keywords
Social network; Outbreak prediction; Information dissemination; Predictors; Data-driven; Information propagation; Diffusion
DOI
10.1016/j.physa.2015.12.019
Chinese Library Classification (CLC) number
O4 [Physics]
Discipline classification code
0702
Abstract
A method for predicting a so-called event outbreak would be beneficial. Existing works mainly focus on exploring effective methods for improving the accuracy of predictions, while ignoring the underlying causes: What makes an event go viral? Which factors significantly influence the prediction of an event outbreak in social networks? In this paper, we proposed a novel definition of an event outbreak that takes into account the structural changes to a network during the propagation of content. In addition, we investigated which features are sensitive predictors of an event outbreak. To investigate the universality of these features at different stages of an event, we split the entire lifecycle of an event into 20 equal segments according to the proportion of the propagation time. We extracted 44 features, including features related to content, users, structure, and time, from each segment of the event. Based on these features, we proposed a prediction method that uses supervised classification algorithms to predict event outbreaks. Experimental results indicate that, as time goes by, our method is highly accurate, with precision ranging from 79% to 97% and recall ranging from 74% to 97%. In addition, after applying a feature-selection algorithm, the top five selected features can considerably improve the accuracy of the prediction. Data-driven experimental results show that the entropy of the eigenvector centrality, the entropy of the PageRank, the standard deviation of the betweenness centrality, the proportion of re-shares without content, and the average path length are the key predictors of an event outbreak. Our findings are especially useful for further exploring the intrinsic characteristics of outbreak prediction. (C) 2015 Elsevier B.V. All rights reserved.
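The abstract names two operations that a small sketch may help make concrete: splitting an event's lifecycle into 20 equal time segments, and summarizing a set of per-node scores (such as centrality values) by its entropy. The Python below is an illustrative sketch only, not the authors' implementation. The function names `split_into_segments` and `shannon_entropy` are hypothetical, and treating "entropy of a centrality" as the Shannon entropy of the scores normalized to sum to 1 is an assumption, since the abstract does not give the exact definition.

```python
import math


def split_into_segments(timestamps, n_segments=20):
    """Assign each timestamp to one of n_segments equal-width time bins.

    Hypothetical helper: the 20-segment split follows the abstract, but the
    binning details here are an assumption.
    """
    start, end = min(timestamps), max(timestamps)
    span = end - start
    if span == 0:
        # All activity happened at one instant; put everything in bin 0.
        return [0] * len(timestamps)
    # The latest timestamp sits on the upper edge, so clamp it into the last bin.
    return [
        min(int((t - start) / span * n_segments), n_segments - 1)
        for t in timestamps
    ]


def shannon_entropy(scores):
    """Shannon entropy (in bits) of non-negative scores normalized to sum to 1.

    Assumption: this normalization is one plausible reading of "entropy of
    the eigenvector centrality" or "entropy of the PageRank".
    """
    total = sum(scores)
    if total <= 0:
        return 0.0
    probs = [s / total for s in scores if s > 0]
    return -sum(p * math.log2(p) for p in probs)
```

Under this reading, evenly spread scores give high entropy, while scores concentrated on a few nodes give low entropy. In practice the scores would come from a graph library applied to the propagation network within each segment.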
Pages: 247-260
Number of pages: 14