Cyberbullying Detection for Urdu Language Using Machine Learning

Citations: 0
Authors
Mustafa, Hamza [1]
Zafar, Kashif [1]
Affiliations
[1] Natl Univ Comp & Emerging Sci, Lahore, Pakistan
Keywords
Cyberbullying Detection; Urdu language; Machine Learning;
DOI
10.1007/978-3-031-62871-9_19
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As the number of social media users grows rapidly, cyberbullying detection has become an increasingly important research topic. Cyberbullying is the sending or posting of text or photos intended to hurt or shame another person or group of people using the Internet, cell phones, video game systems, or other technology; on social media, it means using the platform to hurt or embarrass another person. Past research on cyberbullying detection has focused on English, while the Urdu language has been mostly ignored because it lacks resources. Urdu is widely spoken, especially in parts of South Asia, and is the national language of Pakistan. In this research, machine learning-based approaches are used for cyberbullying detection. (Dataset collection and source). The dataset is labeled by several native speakers, and a majority voting scheme assigns the final label to each Tweet. Three main feature extraction techniques are used for the detection of cyberbullying: TF-IDF, bag-of-words (BOW), and GloVe, combined with different machine learning algorithms. Across multiple experiments, the Extra Tree Classifier (ETC) with TF-IDF outperformed the other algorithms, with an accuracy of 79%. The proposed approach also performed better than previously reported machine learning-based approaches for Urdu sentiment analysis when evaluated on the sentiment dataset.
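The majority voting step mentioned in the abstract can be sketched roughly as follows. This is only an illustration, not the authors' code: the label names, the tweet identifiers, the number of annotators, and the choice to return `None` on a tie are all assumptions, since the abstract does not specify them.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_label(votes: Sequence[str]) -> Optional[str]:
    """Return the label chosen by most annotators, or None on a tie."""
    if not votes:
        return None
    counts = Counter(votes).most_common()
    # Assumption: a tie for first place has no majority, so the item is
    # left unlabeled (for example, for manual review).
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


# Hypothetical annotations: tweet id -> labels given by each annotator.
annotations = {
    "tweet_1": ["bullying", "bullying", "not_bullying"],
    "tweet_2": ["not_bullying", "not_bullying", "not_bullying"],
    "tweet_3": ["bullying", "not_bullying"],
}

final_labels = {tid: majority_label(v) for tid, v in annotations.items()}
```

Using an odd number of annotators per Tweet avoids ties for a two-class labeling scheme, which is one common reason to prefer three or five annotators.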
Pages: 244-257
Number of pages: 14
Related papers
50 in total
  • [32] Cyberbullying Detection using BERT for Telugu Language
    Talasila, Sri Lakshmi
    Kothuri, Dharani Priya
    Manchiraju, Savithri Jahnavi
    Mallavalli, Mutyala Sai Sasank
    Dande, Lourdu Gnana Harshith
    2024 4TH INTERNATIONAL CONFERENCE ON PERVASIVE COMPUTING AND SOCIAL NETWORKING, ICPCSN 2024, 2024, : 454 - 461
  • [33] Cyberbullying detection and machine learning: a systematic literature review
    Balakrisnan, Vimala
    Kaity, Mohammed
    ARTIFICIAL INTELLIGENCE REVIEW, 2023, 56 (SUPPL 1) : 1375 - 1416
  • [34] A Machine Learning Approach to Cyberbullying Detection in Arabic Tweets
    Musleh, Dhiaa
    Rahman, Atta
    Alkherallah, Mohammed Abbas
    Al-Bohassan, Menhal Kamel
    Alawami, Mustafa Mohammed
    Alsebaa, Hayder Ali
    Alnemer, Jawad Ali
    Al-Mutairi, Ghazi Fayez
    Aldossary, May Issa
    Aldowaihi, Dalal A.
    Alhaidari, Fahd
    CMC-COMPUTERS MATERIALS & CONTINUA, 2024, 80 (01): : 1033 - 1054
  • [35] Detection of Touchscreen-Based Urdu Braille Characters Using Machine Learning Techniques
    Shokat, Sana
    Riaz, Rabia
    Rizvi, Sanam Shahla
    Khan, Inayat
    Paul, Anand
    MOBILE INFORMATION SYSTEMS, 2021, 2021
  • [37] Detecting A Twitter Cyberbullying Using Machine Learning
    Dalvi, Rahul Ramesh
    Chavan, Sudhanshu Baliram
    Halbe, Apama
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICICCS 2020), 2020, : 297 - 301
  • [39] Multilingual Detection of Cyberbullying in Mixed Urdu, Roman Urdu, and English Social Media Conversations
    Razi, Fakhra
    Ejaz, Naveed
    IEEE ACCESS, 2024, 12 : 105201 - 105210
  • [40] Recognition of Urdu sign language: a systematic review of the machine learning classification
    Zahid, Hira
    Rashid, Munaf
    Hussain, Samreen
    Azim, Fahad
    Syed, Sidra Abid
    Saad, Afshan
    PEERJ COMPUTER SCIENCE, 2022, 8