Real-Time Social Media Analytics with Deep Transformer Language Models: A Big Data Approach

Cited by: 6
Authors
Ahmet, Ahmed [1]
Abdullah, Tariq [1]
Affiliations
[1] Univ Derby, Dept Comp Sci, Derby, England
Keywords
Real-time analytics; social media; deep learning; machine learning; transfer learning; big data
DOI
10.1109/BigDataSE50710.2020.00014
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Utilisation of transfer learning with deep language models is regarded as one of the most important developments in deep learning. Their application to real-time, high-velocity, high-volume user-generated data has been elusive due to the unprecedented size and complexity of the models, which result in substantial computational overhead. Recent iterations of these architectures have produced significantly distilled models with state-of-the-art performance and reduced resource requirements. We utilize deep transformer language models on user-generated data alongside a robust text normalization pipeline to address what is considered the Achilles heel of deep learning on user-generated text data, namely data normalization. In this paper, we propose a framework for the ingestion, analysis and storage of real-time data streams. A case study in sentiment analysis and offensive/hateful language detection is used to evaluate the framework. We demonstrate inference on a large Twitter dataset using CPU and GPU clusters, highlighting the viability of the fine-tuned distilled language model for high-volume data. The fine-tuned model significantly outperforms the previous state of the art on several benchmark datasets, providing a powerful model that can be utilized for a variety of downstream tasks. To our knowledge, this is the only study demonstrating powerful transformer language models for real-time social media stream analytics in a distributed setting.
Pages: 41-48
Page count: 8