Evaluating the Representativeness in the Geographic Distribution of Twitter User Population

Cited by: 7
Authors
Yin, Junjun [1]
Chi, Guangqing [2]
Van Hook, Jennifer [3]
Affiliations
[1] Penn State Univ, Social Sci Res Inst, State Coll, PA 16801 USA
[2] Penn State Univ, Dept Agr Econ Sociol & Educ, State Coll, PA USA
[3] Penn State Univ, Dept Sociol & Criminol, State Coll, PA USA
Source
PROCEEDINGS OF THE 12TH WORKSHOP ON GEOGRAPHIC INFORMATION RETRIEVAL (GIR'18) | 2018
Keywords
Geo-tagged Tweets; Demographics; Bias; Representativeness; Geographic Distribution;
DOI
10.1145/3281354.3281360
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Twitter data are becoming a Big Data stream and have drawn multidisciplinary interest as a means of studying population characteristics and social problems that traditional surveys cannot measure well. However, the use of Twitter data has met strong resistance because of concerns about how well its users represent the population, as little is known about their demographic characteristics. It is therefore critical to evaluate the extent to which Twitter users represent the population across different demographic groups. This study evaluates that representativeness and examines the geographic distribution of the Twitter user population and its correspondence to the real population. Based on estimates of Twitter user demographics for the contiguous U.S. in 2014, the preliminary results reveal both over- and under-representation of certain demographic groups relative to the real population at the county level. A representation index is used to assess the representativeness of Twitter samples geographically, which may help future studies identify the determinants of these biases.
Pages: 2
Related papers
50 in total
  • [41] TUCAN: Twitter User Centric ANalyzer
    Grimaudo, Luigi
    Song, Han
    Baldi, Mario
    Mellia, Marco
    Munafo, Maurizio
    2013 IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING (ASONAM), 2013, : 1455 - 1457
  • [42] TwitterMancer: Predicting User Interactions on Twitter
    Sotiropoulos, Konstantinos
    Byers, John W.
    Pratikakis, Polyvios
    Tsourakakis, Charalampos E.
    2019 57TH ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING (ALLERTON), 2019, : 973 - 980
  • [43] Twitter User Classification with Posting Locations
    Takeda, Naoto
    Seki, Yohei
    DIGITAL LIBRARIES: KNOWLEDGE, INFORMATION, AND DATA IN AN OPEN ACCESS SOCIETY, 2016, 10075 : 297 - 310
  • [44] Analyzing User Retweet Behavior on Twitter
    Xu, Zhiheng
    Yang, Qing
    2012 IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING (ASONAM), 2012, : 46 - 50
  • [45] Academic information on Twitter: A user survey
    Mohammadi, Ehsan
    Thelwall, Mike
    Kwasny, Mary
    Holmes, Kristi L.
    PLOS ONE, 2018, 13 (05):
  • [46] Polarized User and Topic Tracking in Twitter
    Coletto, Mauro
    Lucchese, Claudio
    Orlando, Salvatore
    Perego, Raffaele
    SIGIR'16: PROCEEDINGS OF THE 39TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2016, : 945 - 948
  • [47] Evaluating the impact of geographic distribution of photovoltaic self-consumption on energy losses
    Garcia-Villalobos, J.
    Eguia, P.
    Torres, E.
    Etxegarai, A.
    2017 IEEE MANCHESTER POWERTECH, 2017,
  • [48] Twitter User Recommendation for Gaining Followers
    Corcoglioniti, Francesco
    Nechaev, Yaroslav
    Giuliano, Claudio
    Zanoli, Roberto
    AI*IA 2018 - ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, 11298 : 539 - 552
  • [49] Evaluating the geographic distribution of plants in Utah from the Atlas of Vascular Plants of Utah
    Ramsey, RD
    Shultz, L
    WESTERN NORTH AMERICAN NATURALIST, 2004, 64 (04) : 421 - 432
  • [50] Twitter and Facebook for User Collection Requests
    Petit, Joan
    COLLECTION MANAGEMENT, 2011, 36 (04) : 253 - 258