Real-Time Social Media Analytics with Deep Transformer Language Models: A Big Data Approach

Cited by: 6
Authors
Ahmet, Ahmed [1]
Abdullah, Tariq [1]
Affiliations
[1] Univ Derby, Dept Comp Sci, Derby, England
Keywords
Real-time analytics; Social media; deep learning; machine learning; transfer learning; big data
DOI
10.1109/BigDataSE50710.2020.00014
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Utilisation of transfer learning with deep language models is regarded as one of the most important developments in deep learning. Their application to real-time, high-velocity, high-volume user-generated data has been elusive due to the unprecedented size and complexity of the models, which result in substantial computational overhead. Recent iterations of these architectures have produced significantly distilled models with state-of-the-art performance and reduced resource requirements. We utilize deep transformer language models on user-generated data alongside a robust text normalization pipeline to address what is considered the Achilles heel of deep learning on user-generated text data, namely data normalization. In this paper, we propose a framework for the ingestion, analysis and storage of real-time data streams. A case study in sentiment analysis and offensive/hateful language detection is used to evaluate the framework. We demonstrate inference on a large Twitter dataset using CPU and GPU clusters, highlighting the viability of the fine-tuned distilled language model for high-volume data. The fine-tuned model significantly outperforms the previous state of the art on several benchmark datasets, providing a powerful model that can be utilized for a variety of downstream tasks. To our knowledge, this is the only study demonstrating powerful transformer language models for real-time social media stream analytics in a distributed setting.
Pages: 41-48
Page count: 8
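
The abstract singles out text normalization as the weak point of deep learning on user-generated text. This record does not describe the authors' pipeline, so the following is only a minimal sketch of what such a step might look like for tweets. The function name `normalize_tweet`, the placeholder tokens `<url>` and `<user>`, and the specific rules are all assumptions chosen for illustration.

```python
import re
import unicodedata

# Hypothetical rules; the paper's actual normalization pipeline is not given here.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")
REPEAT_RE = re.compile(r"(.)\1{2,}")   # any character repeated 3 or more times
SPACE_RE = re.compile(r"\s+")


def normalize_tweet(text: str) -> str:
    """Map a raw tweet to a cleaner string for a downstream language model."""
    text = unicodedata.normalize("NFKC", text)   # unify compatibility characters
    text = URL_RE.sub("<url>", text)             # replace links with a placeholder
    text = MENTION_RE.sub("<user>", text)        # replace @mentions with a placeholder
    text = HASHTAG_RE.sub(r"\1", text)           # keep the hashtag word, drop '#'
    text = REPEAT_RE.sub(r"\1\1", text)          # squeeze long repeats down to two
    return SPACE_RE.sub(" ", text).strip().lower()


if __name__ == "__main__":
    print(normalize_tweet("@bob Sooooo   GOOD!!! https://t.co/abc #BigData"))
```

In a streaming setting, a function like this would typically run on each message before tokenization, so it is kept stateless and dependent only on the standard library.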