Chinese Cyber-Violent Speech Detection and Analysis Based on Pre-trained Model

Cited: 0
Authors
Zhou, Sunrui [1]
Affiliations
[1] Shanghai Univ, Shanghai, Peoples R China
Keywords
Chinese cyber-violent speech; BERT; Hanyu Pinyin; Emotion;
DOI
10.1145/3670105.3670179
CLC Classification Number
TP39 [Computer Applications]
Subject Classification Codes
081203; 0835
Abstract
Cyber-violent speech is prevalent on Chinese social platforms today, and traditional manual moderation by platform administrators is no longer sufficient to detect and analyze it. Automated detection on the Internet with artificial intelligence technologies such as natural language processing is therefore essential to promptly curb the spread of cyber-violent speech. Because cyber-violent speech is covert and diverse, existing models perform unsatisfactorily on implicitly expressed violent speech. This paper proposes a violent speech detection method based on BERT, assisted by Hanyu Pinyin and emotion information, and validates its effectiveness and advantages on multiple datasets. The experimental results are then analyzed to summarize the characteristics of Chinese violent speech, supporting further development of violent speech detection in the future.
Pages: 443-447
Page count: 5
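
The abstract describes fusing a BERT text representation with Hanyu Pinyin and emotion information. The following is a minimal sketch, not the authors' implementation, of one way such a fusion could be wired up; it assumes the `bert-base-chinese` checkpoint from the `transformers` library, the `pypinyin` library for romanization, and a hypothetical external emotion feature vector of fixed dimension. The paper's actual architecture and fusion strategy may differ.

```python
# Illustrative sketch only: BERT character view + Pinyin view + emotion features
# for Chinese violent-speech classification. Hypothetical design choices are noted.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer
from pypinyin import lazy_pinyin


class PinyinEmotionBertClassifier(nn.Module):
    def __init__(self, emotion_dim: int = 8, num_labels: int = 2):
        super().__init__()
        self.tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
        self.char_encoder = BertModel.from_pretrained("bert-base-chinese")
        # The Pinyin view is encoded with a second BERT instance here purely
        # for illustration; sharing weights is an equally plausible choice.
        self.pinyin_encoder = BertModel.from_pretrained("bert-base-chinese")
        hidden = self.char_encoder.config.hidden_size
        self.classifier = nn.Linear(2 * hidden + emotion_dim, num_labels)

    def forward(self, sentences, emotion_feats):
        # Character view: encode the raw Chinese text.
        enc = self.tokenizer(sentences, padding=True, truncation=True,
                             return_tensors="pt")
        char_vec = self.char_encoder(**enc).pooler_output
        # Pinyin view: romanize each sentence and encode the Pinyin string,
        # which can surface homophone-based obfuscations of violent terms.
        pinyin_sentences = [" ".join(lazy_pinyin(s)) for s in sentences]
        enc_py = self.tokenizer(pinyin_sentences, padding=True, truncation=True,
                                return_tensors="pt")
        pinyin_vec = self.pinyin_encoder(**enc_py).pooler_output
        # Late fusion: concatenate both views with the emotion features.
        fused = torch.cat([char_vec, pinyin_vec, emotion_feats], dim=-1)
        return self.classifier(fused)


# Example usage with a dummy emotion vector:
# model = PinyinEmotionBertClassifier()
# logits = model(["示例句子"], torch.zeros(1, 8))
```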