Cyberbullying Detection for Urdu Language Using Machine Learning

Cited by: 0
Authors
Mustafa, Hamza [1 ]
Zafar, Kashif [1 ]
Affiliations
[1] Natl Univ Comp & Emerging Sci, Lahore, Pakistan
Keywords
Cyberbullying Detection; Urdu language; Machine Learning
DOI
10.1007/978-3-031-62871-9_19
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As the number of social media users grows rapidly, cyberbullying detection has become an increasingly important research topic. Cyberbullying is described as the sending or posting of text or photos intended to hurt or shame another person or group of people using the Internet, cell phones, video game systems, or other technology. Put simply, it is the use of social media to hurt or embarrass another person. Past research on cyberbullying detection has focused on English, while the Urdu language has mostly been ignored because it lacks resources. Urdu is widely spoken, especially in parts of South Asia, and is the national language of Pakistan. In this research, machine learning-based approaches are used for cyberbullying detection. (Dataset collection and source). The dataset is labeled by several native speakers, and a majority voting scheme is used to assign the final label to each tweet. Three main feature extraction techniques are used for the detection of cyberbullying: TF-IDF, bag-of-words (BoW), and GloVe, combined with different machine learning algorithms. Across multiple experiments, the Extra Tree Classifier (ETC) with TF-IDF outperformed the other algorithms, with an accuracy score of 79%. The proposed approach also performed better than previously reported machine learning approaches for the Urdu language on a sentiment analysis dataset.
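The majority voting step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the number of annotators, the label names, or how ties are resolved, so the label strings, the tweet identifiers, and the choice to return `None` on a tie are all assumptions made for illustration, not the authors' procedure.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_label(votes: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the label chosen by most annotators, or None if there is no clear winner.

    Returning None on a tie (or on an empty vote list) is an assumption;
    the source does not say how ties were handled.
    """
    if not votes:
        return None
    top_two = Counter(votes).most_common(2)
    # A tie exists when the two most frequent labels have equal counts.
    if len(top_two) == 2 and top_two[0][1] == top_two[1][1]:
        return None
    return top_two[0][0]


# Hypothetical annotations: tweet id -> labels given by individual annotators.
annotations = {
    "tweet_1": ["bullying", "bullying", "not_bullying"],
    "tweet_2": ["not_bullying", "not_bullying", "not_bullying"],
    "tweet_3": ["bullying", "not_bullying"],  # tie between two annotators
}

final_labels = {tweet_id: majority_label(v) for tweet_id, v in annotations.items()}
print(final_labels)
```

Using an odd number of annotators per tweet avoids ties for a two-class label set; with an even number, tied tweets would need a separate adjudication rule.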
Pages: 244-257
Number of pages: 14