An experimental study on the performance of collaborative filtering based on user reviews for large-scale datasets

Cited by: 2
Authors
Al-Ghuribi, Sumaia [1, 2]
Noah, Shahrul Azman Mohd [1]
Mohammed, Mawal [3]
Affiliations
[1] Univ Kebangsaan Malaysia, Ctr Artificial Intelligence Technol, Bangi, Selangor, Malaysia
[2] Taiz Univ, Fac Appl Sci, Dept Comp Sci, Taiz, Yemen
[3] Prince Sattam Bin Abdulaziz Univ, Dept Software Engn, Alkharj, Saudi Arabia
Keywords
Collaborative filtering; Recommender systems; User reviews; Sentiment analysis
DOI
10.7717/peerj-cs.1525
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Collaborative filtering (CF) approaches generate user recommendations based on user similarities, which are calculated from the overall (explicit) user ratings. In some domains, however, such ratings may be sparse or unavailable. User reviews can play a significant role in such cases, because implicit ratings can be derived from the reviews using sentiment analysis, a natural language processing technique. Most current studies calculate the implicit ratings by simply aggregating the scores of all sentiment words appearing in the reviews, thereby ignoring sentiment degrees and the aspects that the reviews discuss. This study addresses this issue by calculating the implicit rating differently: it uses both sentiment words and aspect-sentiment word pairs to exploit the rich information in user reviews and enhance CF performance. It proposes four methods to calculate the implicit ratings on large-scale datasets. The first considers the degree of sentiment words, while the second exploits aspects by extracting aspect-sentiment word pairs. The remaining two methods combine explicit ratings with the implicit ratings generated by the first two methods. The generated ratings are then incorporated into different CF rating prediction algorithms to evaluate their effectiveness in enhancing CF performance. The proposed methods are evaluated experimentally on two large-scale datasets, Amazon and Yelp. The results show that the proposed ratings improved the accuracy of the CF rating prediction algorithms and outperformed the explicit ratings on three predictive accuracy metrics.
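The idea of deriving an implicit rating from degree-weighted sentiment words, and then blending it with an explicit rating, can be illustrated with a minimal sketch. This is not the authors' method: the toy lexicon, the intensifier weights, the negation handling, the linear rescaling to a 1-5 scale, and the blending weight `alpha` are all assumptions made for illustration, and the function names `implicit_rating` and `combined_rating` are hypothetical.

```python
import re

# Hypothetical toy lexicon and degree modifiers (assumptions, not from the paper).
SENTIMENT = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}
INTENSIFIERS = {"very": 1.5, "slightly": 0.5}
NEGATIONS = {"not", "never"}


def implicit_rating(review, lo=1.0, hi=5.0):
    """Map a review to a rating in [lo, hi] using degree-weighted sentiment words."""
    tokens = re.findall(r"[a-z']+", review.lower())
    scores = []
    for i, tok in enumerate(tokens):
        if tok not in SENTIMENT:
            continue
        score = SENTIMENT[tok]
        prev = tokens[i - 1] if i > 0 else ""
        if prev in INTENSIFIERS:
            score *= INTENSIFIERS[prev]  # scale by the degree modifier
        elif prev in NEGATIONS:
            score = -score  # flip polarity after a negation word
        scores.append(score)
    if not scores:
        return (lo + hi) / 2  # neutral midpoint when no sentiment word is found
    mean = sum(scores) / len(scores)
    # Largest magnitude a single weighted score can reach with this lexicon.
    max_abs = max(abs(v) for v in SENTIMENT.values()) * max(INTENSIFIERS.values())
    # Linearly rescale [-max_abs, max_abs] onto [lo, hi].
    return lo + (mean + max_abs) / (2 * max_abs) * (hi - lo)


def combined_rating(explicit, implicit, alpha=0.5):
    """Blend explicit and implicit ratings; alpha is an assumed weight."""
    return alpha * explicit + (1 - alpha) * implicit


if __name__ == "__main__":
    r = implicit_rating("The food was very good but the service was bad.")
    print(round(r, 3), round(combined_rating(4.0, r), 3))
```

The blended value could then replace the explicit rating in the user-item matrix that a CF rating prediction algorithm consumes. An aspect-based variant would score aspect-sentiment word pairs rather than isolated sentiment words.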
Pages: 26