Distributed Sentiment Analysis for Geo-Tagged Twitter Data

Authors
Zengin, Muhammed Said [1]
Arslan, Rabia [1]
Akgun, Mehmet Burak [1]
Affiliation
[1] TOBB Ekon & Teknol Univ, Bilgisayar Muhendisligi Bolumu, Ankara, Turkey
Keywords
Big data; distributed data processing; sentiment analysis; BERT
DOI
10.1109/SIU55565.2022.9864702
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The ever-increasing frequency of sharing on social media makes these platforms one of the primary sources of data for computational social science studies. Similarly, examining and analyzing large-scale social media data sets is crucial for governments as well as companies. However, as the amount of data increases, deriving insights from it with artificial-intelligence-based models becomes more and more demanding in terms of processing power. In fact, hardware requirements may increase dramatically if the insights are needed under real-time or near-real-time constraints. In this study, we developed a distributed sentiment analysis model that utilizes a large social media data set. 16 million tweets were collected and grouped by the originating city. The sentiment analysis model was produced by fine-tuning the pre-trained BERT model. Apache Spark, a distributed big data analytics engine, is used to execute the trained model in a distributed fashion. For evaluation purposes, the prediction time on a single compute unit is compared with the distributed prediction time. The sentiment analysis model was executed separately for each of the data groups corresponding to 81 provinces. The data set containing 16 million tweets used in this study, the Turkish sentiment analysis model produced, the distributed prediction code developed for Apache Spark, and all the results of the study can be accessed at https://distributed-sentiment-analysis.github.io/.
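The group-then-predict pattern described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: a thread pool from the Python standard library stands in for Apache Spark, a hypothetical keyword rule (`predict_sentiment`) stands in for the fine-tuned BERT model, and the sample tweets are invented for illustration.

```python
from collections import Counter, defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the fine-tuned BERT classifier: a keyword rule.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "awful", "hate"}


def predict_sentiment(text):
    """Label one tweet as positive, negative, or neutral (placeholder logic)."""
    words = set(text.lower().split())
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def group_by_province(tweets):
    """Group (province, text) pairs into {province: [text, ...]}."""
    groups = defaultdict(list)
    for province, text in tweets:
        groups[province].append(text)
    return dict(groups)


def score_group(item):
    """Run the classifier over one province's tweets and count the labels."""
    province, texts = item
    return province, Counter(predict_sentiment(t) for t in texts)


def predict_sequential(groups):
    """Baseline: score every group on a single worker."""
    return dict(score_group(item) for item in groups.items())


def predict_parallel(groups, workers=4):
    """Score groups concurrently; the pool is a stand-in for Spark executors."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(score_group, groups.items()))


# Invented sample data, for illustration only.
tweets = [
    ("Ankara", "I love this city"),
    ("Ankara", "awful traffic today"),
    ("Izmir", "great weather"),
    ("Istanbul", "nothing special"),
]
groups = group_by_province(tweets)
sequential = predict_sequential(groups)
parallel = predict_parallel(groups)
```

Timing `predict_sequential` against `predict_parallel` would mirror the single-unit versus distributed comparison in the abstract, although Python threads give little speed-up for CPU-bound work; a real measurement would need separate processes or a cluster engine such as Spark.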
Pages: 4