Performance evaluation of machine learning models on large dataset of android applications reviews

Cited by: 5
Authors
Qureshi, Ali Adil [1]
Ahmad, Maqsood [2]
Ullah, Saleem [1]
Yasir, Muhammad Naveed [3]
Rustam, Furqan [4]
Ashraf, Imran [5]
Affiliations
[1] Khwaja Fareed Univ Engn & Informat Technol, Dept Comp Sci, Rahim Yar Khan 64200, Pakistan
[2] Islamia Univ Bahawalpur, Dept Informat Secur, Bahawalpur 63100, Punjab, Pakistan
[3] Univ Narowal, Dept Comp Sci, Narowal 51600, Pakistan
[4] Univ Coll Dublin, Sch Comp Sci, Dublin, Ireland
[5] Yeungnam Univ, Informat & Commun Engn, Gyongsan 38541, South Korea
Keywords
Opinion mining; Sentiment analysis; Mobile apps reviews; Google Play Store; Classification
DOI
10.1007/s11042-023-14713-6
Chinese Library Classification
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
With an ever-increasing number of mobile users, the development of mobile applications (apps) has become a promising market during the past decade. Billions of users download mobile apps for diverse uses from the Google Play Store, complete tasks, and leave comments about their experience. Such reviews are replete with feedback that guides the improvement of existing apps and inspires novel ones. However, app reviews are broad in scope and challenging to analyze. When segregated into different classes, such reviews guide users in selecting suitable apps. This study proposes a framework for analyzing the sentiment of reviews for apps in eight different categories, such as shopping, sports, and casual. A large dataset comprising 251661 user reviews is scraped with the help of 'Regular Expression' and 'Beautiful Soup'. The framework uses different machine learning models along with term frequency-inverse document frequency (TF-IDF) for feature extraction. Extensive experiments are performed using preprocessing steps, as well as the stats feature of app reviews, to evaluate the performance of the models. Results indicate that combining the stats feature with TF-IDF yields better performance and that the support vector machine obtains the highest accuracy. The experimental results can potentially be used by other researchers to select appropriate models for the analysis of app reviews. In addition, the provided dataset is large, diverse, and balanced, covering eight categories and reviews of 59 apps, and it provides the opportunity to analyze reviews using state-of-the-art approaches.
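As a rough illustration of the feature extraction step named in the abstract, the sketch below computes TF-IDF vectors for a few toy reviews and appends simple per-review statistics. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the TF-IDF variant or what the "stats feature" contains, so the unsmoothed formula tf * ln(N / df), the helper names (`tokenize`, `tfidf_vectors`, `stats_features`), the choice of character and word counts, and the sample reviews are all hypothetical.

```python
import math
from collections import Counter

PUNCT = ".,!?;:'\"()"

def tokenize(text):
    """Lowercase whitespace-separated words and strip surrounding punctuation."""
    words = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in words if w]

def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for a list of tokenized documents.

    Assumed textbook variant: tf = count / document length,
    idf = ln(N / document frequency), with no smoothing.
    """
    n_docs = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    vocab = sorted(df)
    vectors = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        vectors.append(
            [(counts[t] / total) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, vectors

def stats_features(text):
    """Hypothetical 'stats' features: character count and word count."""
    return [float(len(text)), float(len(text.split()))]

# Toy reviews, invented for illustration only.
reviews = [
    "Great app, works great",
    "Terrible app, crashes often",
    "Works fine",
]

tokens = [tokenize(r) for r in reviews]
vocab, tfidf = tfidf_vectors(tokens)

# Combine TF-IDF with the stats features by concatenating the two vectors.
features = [vec + stats_features(r) for vec, r in zip(tfidf, reviews)]
```

Each row of `features` could then be passed, together with a sentiment label, to a classifier such as a support vector machine. In practice a library implementation of TF-IDF would likely replace the hand-written version shown here.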
Pages: 37197-37219
Number of pages: 23