Cyberbullying detection in social media text based on character-level convolutional neural network with shortcuts

Cited by: 35
Authors
Lu, Nijia [1]
Wu, Guohua [1]
Zhang, Zhen [1, 4]
Zheng, Yitao [1]
Ren, Yizhi [1]
Choo, Kim-Kwang Raymond [2, 3]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Cyberspace, Hangzhou, Zhejiang, Peoples R China
[2] Univ Texas San Antonio, Dept Informat Syst & Cyber Secur, San Antonio, TX USA
[3] Univ Texas San Antonio, Dept Elect & Comp Engn, San Antonio, TX USA
[4] 1158, 2 St, Baiyang St, Hangzhou 310018, Zhejiang, Peoples R China
Source
Funding
National Natural Science Foundation of China;
Keywords
convolutional neural networks; cyberbullying detection; social network; text classification;
DOI
10.1002/cpe.5627
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
As people spend more and more time on social networks, cyberbullying has become a social problem that calls for machine learning solutions. Our research focuses on textual cyberbullying detection because text is the most common form of content on social media. However, social media text is short, noisy, and unstructured, with misspellings and symbols, which degrades the performance of some traditional machine learning methods that rely on vocabulary knowledge. For this reason, we propose Char-CNNS (Character-level Convolutional Neural Network with Shortcuts), a model that identifies whether social media text contains cyberbullying. We use characters as the smallest unit of learning, enabling the model to overcome spelling errors and intentional obfuscation in real-world corpora. Shortcuts are used to stitch together features from different levels so that the model learns more granular bullying signals, and a focal loss function is adopted to overcome the class imbalance problem. We also provide a new Chinese Weibo comment dataset specifically for cyberbullying detection, and experiments are performed on both the Chinese Weibo dataset and an English Tweet dataset. The experimental results show that our approach is competitive with state-of-the-art techniques on the cyberbullying detection task.
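Two ideas from the abstract can be illustrated with a minimal sketch: character-level input encoding and a focal loss for class imbalance. This is not the authors' implementation. The function names `encode_chars` and `binary_focal_loss`, the toy alphabet, and the defaults `gamma=2.0` and `alpha=0.25` are assumptions for illustration (the defaults are the values commonly cited for the standard binary focal loss); the abstract does not state which settings Char-CNNS uses.

```python
import math


def encode_chars(text, alphabet, max_len):
    """Map each character to an integer index, padded or truncated to max_len.

    Index 0 is reserved for padding and for characters outside the alphabet,
    so a misspelled or obfuscated word still yields a usable sequence
    instead of a single out-of-vocabulary token.
    """
    index = {ch: i + 1 for i, ch in enumerate(alphabet)}
    ids = [index.get(ch, 0) for ch in text.lower()[:max_len]]
    return ids + [0] * (max_len - len(ids))


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Standard binary focal loss for one example (hypothetical helper).

    p is the predicted probability of the positive (bullying) class,
    y is the true label in {0, 1}. The factor (1 - p_t) ** gamma shrinks
    the loss of well-classified examples, so the many easy negatives in an
    imbalanced corpus contribute less than the hard examples.
    """
    p = min(max(p, eps), 1.0 - eps)           # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p            # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    print(encode_chars("Ab!", "abc", 5))
    print(binary_focal_loss(0.9, 1), binary_focal_loss(0.1, 1))
```

With `gamma=0` the modulating factor disappears and the function reduces to an alpha-weighted cross-entropy, which is a convenient sanity check when tuning.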
Pages: 11
Related papers
50 in total
  • [1] Character-level Intrusion Detection Based on Convolutional Neural Networks
    Lin, Steven Z.
    Shi, Yong
    Xue, Zhi
    2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018,
  • [2] A Complaint Text Classification Model Based on Character-level Convolutional Network
    Tong, Xuesong
    Wu, Bin
    Wang, Shuyang
    Lv, Jinna
    PROCEEDINGS OF 2018 IEEE 9TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE (ICSESS), 2018, : 507 - 511
  • [3] Character-Level Convolutional Neural Network for Paraphrase Detection and Other Experiments
    Maraev, Vladislav
    Saedi, Chakaveh
    Rodrigues, Joao
    Branco, Antonio
    Silva, Joao
    ARTIFICIAL INTELLIGENCE AND NATURAL LANGUAGE, 2018, 789 : 293 - 304
  • [4] Character-level text classification via convolutional neural network and gated recurrent unit
    Liu, Bing
    Zhou, Yong
    Sun, Wei
    International Journal of Machine Learning and Cybernetics, 2020, 11 : 1939 - 1949
  • [5] A Character-level Convolutional Neural Network with Dynamic Input Length for Thai Text Categorization
    Koomsubha, Thanabhat
    Vateekul, Peerapon
    2017 9TH INTERNATIONAL CONFERENCE ON KNOWLEDGE AND SMART TECHNOLOGY (KST), 2017, : 101 - 105
  • [6] Character-level text classification via convolutional neural network and gated recurrent unit
    Liu, Bing
    Zhou, Yong
    Sun, Wei
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2020, 11 (08) : 1939 - 1949
  • [7] Text Classification and Transfer Learning Based on Character-Level Deep Convolutional Neural Networks
    Sato, Minato
    Orihara, Ryohei
    Sei, Yuichi
    Tahara, Yasuyuki
    Ohsuga, Akihiko
    AGENTS AND ARTIFICIAL INTELLIGENCE (ICAART 2017), 2018, 10839 : 62 - 81
  • [8] A Character-Level Convolutional Neural Network for Predicting Exploitability of Vulnerability
    Lyu, Jinghui
    Bai, Yude
    Xing, Zhenchang
    Li, Xiaohong
    Ge, Weimin
    2021 INTERNATIONAL SYMPOSIUM ON THEORETICAL ASPECTS OF SOFTWARE ENGINEERING (TASE 2021), 2021, : 119 - 126
  • [9] Character-level Convolutional Networks for Text Classification
    Zhang, Xiang
    Zhao, Junbo
    LeCun, Yann
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28
  • [10] Improving Bug Localization with Character-level Convolutional Neural Network and Recurrent Neural Network
    Xiao, Yan
    Keung, Jacky
    2018 25TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE (APSEC 2018), 2018, : 703 - 704