Hate Speech Detection in Indonesian Language Instagram

Cited by: 0
Authors
Putra, I. Gede Manggala [1]
Nurjanah, Dade [1]
Affiliations
[1] Telkom Univ, Informat Engn, Bandung, Indonesia
Keywords
hate speech comments; Instagram; word2vec; TextCNN; imbalanced dataset
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Hate speech is a form of communication that expresses hatred by inciting, insulting, disparaging, or demeaning a person or group. In Indonesia, hate speech is often linked to politics. In 2018 and 2019, for example, it was tied to the local leader and presidential elections. Those who spread hate speech commonly use social networks, such as Instagram, to do so. About 60% of hate speech is found in the comments on posts, and it becomes a real threat if it is not detected quickly. Our study aims to detect hate speech in Instagram comments. We propose using word2vec with the skip-gram model and a modified TextCNN to learn and detect hate speech texts. In addition, random oversampling, random undersampling, and class weighting were used to address the imbalanced dataset problem. The results show that the best performance, an F-score of 93.70%, was obtained by combining word2vec skip-gram with a window size of 15, a modified TextCNN, and random oversampling.
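The two simplest imbalance remedies named in the abstract, random oversampling and class weighting, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy comments, and the labels (0 = non-hate, 1 = hate) are hypothetical, and the weighting formula (total samples divided by number of classes times class count) is an assumed, commonly used "balanced" heuristic that the abstract does not specify.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen minority samples until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_texts, out_labels = list(texts), list(labels)
    for cls, n in counts.items():
        # Pool of original samples belonging to this class.
        pool = [t for t, y in zip(texts, labels) if y == cls]
        for _ in range(target - n):
            out_texts.append(rng.choice(pool))
            out_labels.append(cls)
    return out_texts, out_labels


def balanced_class_weights(labels):
    """Assumed heuristic: weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    total, k = len(labels), len(counts)
    return {cls: total / (k * n) for cls, n in counts.items()}


# Hypothetical toy data: four non-hate comments, one hate comment.
comments = ["c1", "c2", "c3", "c4", "c5"]
labels = [0, 0, 0, 0, 1]

bal_texts, bal_labels = random_oversample(comments, labels)
weights = balanced_class_weights(labels)
print(Counter(bal_labels))
print(weights)
```

Oversampling changes the training set itself, while class weights leave the data untouched and instead scale each class's contribution to the loss. Oversampling should be applied to the training split only, so that duplicated comments do not leak into evaluation data.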
Pages: 413-419
Number of pages: 7