A Review of Urdu Sentiment Analysis with Multilingual Perspective: A Case of Urdu and Roman Urdu Language

Cited by: 19
Authors
Khan, Ihsan Ullah [1]
Khan, Aurangzeb [1, 2]
Khan, Wahab [1]
Su'ud, Mazliham Mohd [2]
Alam, Muhammad Mansoor [3]
Subhan, Fazli [2, 4]
Asghar, Muhammad Zubair [5]
Affiliations
[1] Univ Sci & Technol, Dept Comp Sci, Bannu 28100, Pakistan
[2] Multimedia Univ, Fac Comp & Informat, Kuala Lumpur 50050, Malaysia
[3] Riphah Int Univ, Rawalpindi 74400, Pakistan
[4] Natl Univ Modern Languages NUML, Fac Engn & Comp Sci, Islamabad 44000, Pakistan
[5] Gomal Univ, Inst Comp & Informat Technol, Dera Ismail Khan 29050, Pakistan
Keywords
preprocessing; feature extraction; classification
DOI
10.3390/computers11010003
Chinese Library Classification
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Research on sentiment analysis has grown rapidly in the last few years because of its applicability in areas such as online product purchasing, marketing, and reputation management. Social media and online shopping sites have become a rich source of user-generated data, and manufacturing, sales, and marketing organizations increasingly turn to this source for worldwide feedback on their activities and products. Millions of sentences in Urdu and Roman Urdu are posted daily on social sites such as Facebook, Instagram, Snapchat, and Twitter. Disregarding opinions expressed in Urdu and Roman Urdu and considering only the resource-rich English language means losing this vast amount of data. Our research focused on collecting research papers on Urdu and Roman Urdu and analyzing them in terms of preprocessing, feature extraction, and classification techniques. This paper presents a comprehensive study of research conducted on Roman Urdu and Urdu text for product reviews. The study is organized into categories: collection of relevant corpora, data preprocessing, feature extraction, classification platforms and approaches, limitations, and future work. The comparison was based on research factors such as corpus, lexicon, and opinions. Each reviewed paper was evaluated against a set of benchmarks and categorized accordingly. Based on the results obtained and the comparisons made, we suggest some helpful steps for future studies.
Pages: 29