Sentiment Classification of Crowdsourcing Participants' Reviews Text Based on LDA Topic Model

Cited by: 19
Authors
Huang, Yanrong [1]
Wang, Rui [2, 3]
Huang, Bin [1]
Wei, Bo [4]
Zheng, Shu Li [1]
Chen, Min [5]
Affiliations
[1] Zhejiang Univ Water Resource & Elect Power, Coll Econ & Management, Hangzhou 310018, Peoples R China
[2] Jiangxi Univ Sci & Technol, Sch Econ & Management, Ganzhou 341000, Peoples R China
[3] Fujian Agr & Forestry Univ, Sch Econ, Fuzhou 350000, Peoples R China
[4] Zhejiang Sci Tech Univ, Sch Informat Sci & Technol, Hangzhou 310018, Peoples R China
[5] Wuhan Univ, Sch Comp Sci, State Key Lab Software Engn, Wuhan 430072, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Crowdsourcing; Feature extraction; Dictionaries; Support vector machines; Sentiment analysis; Classification algorithms; Text categorization; text classification; LDA topic model; crowdsourcing participants;
DOI
10.1109/ACCESS.2021.3101565
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The review text received by crowdsourcing participants contains valuable knowledge, opinions, and preferences. It is an important basis on which employers make trading decisions and crowdsourcing participants improve their service level and quality. However, a single review often contains words of both emotional polarities, and the sentiment classification of review text with fuzzy emotional boundaries has received insufficient attention. This paper proposes a supervised text sentiment classification method with Latent Dirichlet Allocation (LDA) to improve the classification performance on review text with fuzzy sentiment boundaries. The data set consists of the review text of crowdsourcing participants on the Zhubajie platform, and text features are extracted with the N-gram, Word2vec, and TF-IDF algorithms. The LDA topic model is applied to expand the number of text features and to extract eight topics that affect employers' sentiment tendencies. Text classifiers are constructed based on the Support Vector Machine (SVM), Random Forest (RF), Gradient Boosting Decision Tree (GBDT), and Extreme Gradient Boosting (XGBoost) algorithms, and the effectiveness of the sentiment classification methods is verified by ten-fold cross-validation and confusion matrices. Experimental results show that using the LDA topic model to extend the features of review text effectively alleviates the classifier's difficulty in distinguishing the sentiment category of text in which words of different emotional polarity coexist, and enhances its ability to classify text with fuzzy emotional boundaries. With TF-IDF and LDA used to extract and expand the text features, the GBDT text sentiment classifier achieves an accuracy of 0.881, and the F1-measures of the second, third, fourth, and fifth category samples are 0.462, 0.571, 0.278, and 0.647 respectively. This is the best classification performance among the classifiers tested, ahead of SVM, RF, and XGBoost.
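The abstract reports both an overall accuracy and per-category F1-measures derived from a confusion matrix. The following minimal sketch shows how those two kinds of figures are computed, and why a high accuracy can coexist with much lower F1 values on minority categories. The function names and the 3-class toy matrix are hypothetical illustrations and are not taken from the paper or its data.

```python
from typing import Dict, List


def accuracy(confusion: List[List[int]]) -> float:
    """Overall accuracy: correctly classified samples / all samples."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return correct / total if total else 0.0


def per_class_f1(confusion: List[List[int]]) -> Dict[int, float]:
    """F1 per class; rows are true labels, columns are predicted labels."""
    n = len(confusion)
    scores = {}
    for k in range(n):
        tp = confusion[k][k]                                 # true positives
        fn = sum(confusion[k]) - tp                          # missed samples of class k
        fp = sum(confusion[r][k] for r in range(n)) - tp     # wrongly predicted as k
        denom = 2 * tp + fp + fn
        scores[k] = 2 * tp / denom if denom else 0.0
    return scores


# Hypothetical, imbalanced toy confusion matrix (NOT data from the paper).
toy = [
    [90, 5, 5],
    [6, 3, 1],
    [4, 1, 5],
]

if __name__ == "__main__":
    print("accuracy:", round(accuracy(toy), 3))
    for label, f1 in per_class_f1(toy).items():
        print("class", label, "F1:", round(f1, 3))
```

In this toy case the dominant class drives the accuracy, while the two small classes score far lower on F1, which is the kind of gap that per-category reporting is meant to expose.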
Pages: 108131 / 108143
Number of pages: 13
Related papers
50 in total
  • [41] A LDA model based topic detection method
    1600, Northwestern Polytechnical University (34):
  • [43] A topic sentence-based instance transfer method for imbalanced sentiment classification of Chinese product reviews
    Tian, Feng
    Wu, Fan
    Chao, Kuo-Ming
    Zheng, Qinghua
    Shah, Nazaraf
    Lan, Tian
    Yue, Jia
    ELECTRONIC COMMERCE RESEARCH AND APPLICATIONS, 2016, 16 : 66 - 76
  • [44] Twin labeled LDA: a supervised topic model for document classification
    Wang, Wei
    Guo, Bing
    Shen, Yan
    Yang, Han
    Chen, Yaosen
    Suo, Xinhua
    APPLIED INTELLIGENCE, 2020, 50 (12) : 4602 - 4615
  • [45] Application of Traffic Environment Accident Information Text Processing Technology Based on LDA Topic Model
    Zhang, Junhui
    Shao, Kejia
    Guan, Tianchao
    EKOLOJI, 2019, 28 (107) : 4843 - 4846
  • [46] Classification of Customer Reviews based on Sentiment Analysis
    Graebner, Dietmar
    Zanker, Markus
    Fliedl, Guenther
    Fuchs, Matthias
    INFORMATION AND COMMUNICATION TECHNOLOGIES IN TOURISM 2012, 2012, : 460 - 470
  • [47] Sentiment Analysis and Classification Based On Textual Reviews
    Mouthami, K.
    Devi, K. Nirmala
    Bhaskaran, V. Murali
    2013 INTERNATIONAL CONFERENCE ON INFORMATION COMMUNICATION AND EMBEDDED SYSTEMS (ICICES), 2013, : 271 - 276
  • [48] Sentiment Analysis in Online Reviews Classification using Text Mining Techniques
    Agueda, M.
    Rita, P.
    Guerreiro, P.
    2019 14TH IBERIAN CONFERENCE ON INFORMATION SYSTEMS AND TECHNOLOGIES (CISTI), 2019,
  • [49] Experiments in Text Classification: Analyzing the Sentiment of Electronic Product Reviews in Greek
    Bilianos, Dimitris
    JOURNAL OF QUANTITATIVE LINGUISTICS, 2022, 29 (03) : 374 - 386
  • [50] Evaluation of Product Reviews Based on Text Sentiment Analysis
    Jiang, Yuhao
    Wang, Haiguang
    Yi, Tianlun
    PROCEEDINGS OF 2021 2ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND INFORMATION SYSTEMS (ICAIIS '21), 2021,