Hate Speech Detection in Indonesian Language Instagram

Authors
Putra, I. Gede Manggala [1]
Nurjanah, Dade [1]
Affiliations
[1] Telkom Univ, Informat Engn, Bandung, Indonesia
Keywords
hate speech comments; Instagram; word2vec; TextCNN; imbalanced dataset
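DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract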
Hate speech is communication that expresses hatred by inciting, insulting, disparaging, or demeaning a person or group. Hate speech in Indonesia is often linked to politics: in 2018 and 2019, for example, it related to the local leader and presidential elections. Perpetrators commonly use social networks such as Instagram to spread hateful messages. About 60% of hate speech is found in the comments on posts, and it becomes a real threat if not detected quickly. Our study aims to detect hate speech in Instagram comments. We propose using word2vec with the skip-gram model and a modified TextCNN to learn and detect hate speech texts. In addition, random oversampling, random undersampling, and class weighting were used to address dataset imbalance. The results show that the best performance, measured by F-score, is 93.70%, obtained by combining word2vec skip-gram with a window size of 15, the modified TextCNN, and random oversampling.
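The abstract names three remedies for class imbalance but does not say how they were implemented. The following is a minimal sketch of the generic techniques only, using the Python standard library. The function names, the toy comments, and the "balanced" weight formula n_samples / (n_classes * class_count) are assumptions chosen for illustration, not the authors' code.

```python
import random
from collections import Counter, defaultdict


def _group_by_label(texts, labels):
    """Collect texts into one list per class label."""
    groups = defaultdict(list)
    for text, label in zip(texts, labels):
        groups[label].append(text)
    return groups


def _flatten(pairs):
    """Split a list of (text, label) pairs back into two parallel lists."""
    return [t for t, _ in pairs], [l for _, l in pairs]


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen samples of each smaller class until every
    class has as many samples as the largest class."""
    rng = random.Random(seed)
    groups = _group_by_label(texts, labels)
    target = max(len(items) for items in groups.values())
    pairs = []
    for label, items in groups.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        pairs.extend((text, label) for text in items + extra)
    rng.shuffle(pairs)
    return _flatten(pairs)


def random_undersample(texts, labels, seed=0):
    """Keep a random subset of each larger class so that every class has as
    many samples as the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_label(texts, labels)
    target = min(len(items) for items in groups.values())
    pairs = []
    for label, items in groups.items():
        pairs.extend((text, label) for text in rng.sample(items, target))
    rng.shuffle(pairs)
    return _flatten(pairs)


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count), so rarer
    classes contribute more to a weighted loss."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


# Toy data (hypothetical): 2 hate-speech comments (1) and 8 neutral ones (0).
texts = [f"comment {i}" for i in range(10)]
labels = [1] * 2 + [0] * 8

_, over_labels = random_oversample(texts, labels)
_, under_labels = random_undersample(texts, labels)
print(sorted(Counter(over_labels).items()))   # [(0, 8), (1, 8)]
print(sorted(Counter(under_labels).items()))  # [(0, 2), (1, 2)]
print(balanced_class_weights(labels))         # {1: 2.5, 0: 0.625}
```

Oversampling keeps every original comment and repeats minority ones, undersampling discards majority comments, and class weighting leaves the data untouched and rescales the loss instead. Resampling is normally applied to the training split only, so that evaluation scores such as the F-score reflect the original class distribution.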
Pages: 413-419
Page count: 7