E-Commerce Fake Reviews Detection Using LSTM with Word2Vec Embedding

Authors
Raheem, Mafas [1]
Chong, Yi Chien [1]
Affiliations
[1] School of Computing, Asia Pacific University of Technology and Innovation, Kuala Lumpur, Malaysia
Keywords
Adversarial machine learning - Contrastive learning - Deep learning - Embeddings - Natural language processing systems
DOI
10.20532/cit.2024.1005803
Abstract
Customer reviews inform potential buyers' decisions, but fake reviews in e-commerce can skew perceptions as customers may feel pressured to leave positive feedback. Detecting fake reviews in e-commerce platforms is a critical challenge, impacting online shopping and deceiving customers. Effective detection strategies, employing deep learning architectures and word embeddings, are essential to combat this issue. Specifically, the study presented in this paper employed a 1-layer Simple LSTM model, a 1D Convolutional model, and a combined CNN+LSTM model. These models were trained using different pre-trained word embeddings, including Word2Vec, GloVe, and FastText, as well as Keras embeddings, to convert the text data into vector form. The models were evaluated based on accuracy and F1-score to provide a comprehensive measure of their performance. The results indicated that the Simple LSTM model with Word2Vec embeddings achieved an accuracy of nearly 91% and an F1-score of 0.9024, outperforming all other model-embedding combinations. The 1D convolutional model performed best without any embeddings, suggesting its ability to extract meaningful features from the raw text. The transformer-based models, BERT and DistilBERT, showed progressive learning but struggled with generalization, indicating the need for strategies such as early stopping, dropout, or regularization to prevent overfitting. Notably, the DistilBERT model consistently outperformed the LSTM model, achieving optimal performance with an accuracy of 96% and an F1-score of 0.9639 using a batch size of 32 and a learning rate of 4.00E-05. ACM CCS (2012) Classification: Computing methodologies → Artificial intelligence → Natural language processing. © 2024, University of Zagreb Faculty of Electrical Engineering and Computing. All rights reserved.
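The abstract reports accuracy and F1-score as the evaluation metrics. As a minimal sketch of how these two metrics are commonly computed for a binary fake/genuine classifier, the following uses plain Python. The function name, the label encoding (1 = fake, 0 = genuine), and the toy labels are assumptions for illustration only and are not taken from the paper.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels.

    `positive` marks the class treated as "fake" (an assumed encoding).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return accuracy, f1


# Toy labels (hypothetical, not data from the study).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.4f}, f1={f1:.4f}")
```

Reporting F1 alongside accuracy is useful when the fake and genuine classes are imbalanced, since accuracy alone can look high for a model that rarely predicts the minority class.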
Pages: 65-80