Chinese Cyber-Violent Speech Detection and Analysis Based on Pre-trained Model

Cited: 0
Authors
Zhou, Sunrui [1 ]
Affiliations
[1] Shanghai Univ, Shanghai, Peoples R China
Keywords
Chinese cyber-violent speech; BERT; Hanyu Pinyin; Emotion;
DOI
10.1145/3670105.3670179
CLC Number
TP39 [Computer Applications];
Discipline Code
081203; 0835
Abstract
Cyber-violent speech is prevalent on Chinese social platforms today, and traditional manual moderation by platform administrators is no longer sufficient to detect and analyze it. Automated detection with artificial intelligence technologies such as natural language processing is therefore essential for promptly curbing the spread of cyber-violent speech. Because cyber-violent speech is covert and diverse in form, existing models perform poorly on implicitly expressed violent speech. This paper proposes a violent speech detection method based on BERT, assisted by Hanyu Pinyin and emotion features, and validates its effectiveness and superiority on multiple datasets. The experimental results are then analyzed to summarize the characteristics of Chinese violent speech, supporting future work on violent speech detection.
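The abstract's use of Hanyu Pinyin suggests pairing the surface text with its phonetic transliteration, so that homophone-based obfuscations (e.g. substituting 沙 for 傻) collapse to the same pinyin form. The paper's code is not published; the following is a minimal stdlib-only sketch of that input-construction idea, where `CHAR_TO_PINYIN` is a toy hand-written table standing in for a real transliteration library such as pypinyin, and `build_model_input` is a hypothetical helper, not the paper's actual pipeline.

```python
# Sketch: pair a sentence with its Hanyu Pinyin, BERT sentence-pair style.
# CHAR_TO_PINYIN is a toy stand-in for a real transliteration library
# (e.g. pypinyin); tones are omitted for simplicity.

CHAR_TO_PINYIN = {
    "你": "ni", "是": "shi", "傻": "sha", "沙": "sha", "子": "zi",
}

def to_pinyin(text: str) -> str:
    """Transliterate character by character, keeping unknown ones as-is."""
    return " ".join(CHAR_TO_PINYIN.get(ch, ch) for ch in text)

def build_model_input(text: str) -> str:
    """Join raw text and its pinyin as a BERT-style sentence pair."""
    return f"[CLS] {text} [SEP] {to_pinyin(text)} [SEP]"

# A homophone substitution (傻 -> 沙) changes the character channel but
# leaves the pinyin channel identical, which is what lets a model trained
# on such paired inputs catch phonetically obfuscated violent speech.
print(to_pinyin("你是傻子") == to_pinyin("你是沙子"))  # True
```

The emotion-assistance component described in the abstract would add a further feature channel (e.g. sentiment scores) alongside this paired input; its exact form is not specified in the record.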
Pages: 443-447
Number of pages: 5
Related Papers (50 total)
  • [21] Automatic Prosody Annotation with Pre-Trained Text-Speech Model
    Dai, Ziqian
    Yu, Jianwei
    Wang, Yan
    Chen, Nuo
    Bian, Yanyao
    Li, Guangzhi
    Cai, Deng
    Yu, Dong
    INTERSPEECH 2022, 2022, : 5513 - 5517
  • [22] Automatic Speech Recognition Dataset Augmentation with Pre-Trained Model and Script
    Kwon, Minsu
    Choi, Ho-Jin
    2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP), 2019, : 649 - 651
  • [23] Detection of Unstructured Sensitive Data Based on a Pre-Trained Model and Lattice Transformer
    Jin, Feng
    Wu, Shaozhi
    Liu, Xingang
    Su, Han
    Tian, Miao
    2024 7TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND BIG DATA, ICAIBD 2024, 2024, : 180 - 185
  • [24] Conditional pre-trained attention based Chinese question generation
    Zhang, Liang
    Fang, Ligang
    Fan, Zheng
    Li, Wei
    An, Jing
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2021, 33 (20):
  • [25] Pre-trained language model augmented adversarial training network for Chinese clinical event detection
    Zhang, Zhichang
    Zhang, Minyu
    Zhou, Tong
    Qiu, Yanlong
    MATHEMATICAL BIOSCIENCES AND ENGINEERING, 2020, 17 (04) : 2825 - 2841
  • [26] PLLM-CS: Pre-trained Large Language Model (LLM) for cyber threat detection in satellite networks
    Hassanin, Mohammed
    Keshk, Marwa
    Salim, Sara
    Alsubaie, Majid
    Sharma, Dharmendra
    AD HOC NETWORKS, 2025, 166
  • [27] Pre-trained models for detection and severity level classification of dysarthria from speech
    Javanmardi, Farhad
    Kadiri, Sudarsana Reddy
    Alku, Paavo
    SPEECH COMMUNICATION, 2024, 158
  • [28] Lawformer: A pre-trained language model for Chinese legal long documents
    Xiao, Chaojun
    Hu, Xueyu
    Liu, Zhiyuan
    Tu, Cunchao
    Sun, Maosong
    AI OPEN, 2021, 2 : 79 - 84
  • [29] Detection of Speech Related Disorders by Pre-trained Embedding Models Extracted Biomarkers
    Jenei, Attila Zoltan
    Kiss, Gabor
    Sztaho, David
    SPEECH AND COMPUTER, SPECOM 2022, 2022, 13721 : 279 - 289
  • [30] Using Noise and External Knowledge to Enhance Chinese Pre-trained Model
    Ma, Haoyang
    Li, Zeyu
    Guo, Hongyu
    2022 IEEE 34TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, ICTAI, 2022, : 476 - 480