Sentiment Classification of Crowdsourcing Participants' Reviews Text Based on LDA Topic Model

Cited by: 19
Authors
Huang, Yanrong [1]
Wang, Rui [2, 3]
Huang, Bin [1]
Wei, Bo [4]
Zheng, Shu Li [1]
Chen, Min [5]
Affiliations
[1] Zhejiang Univ Water Resource & Elect Power, Coll Econ & Management, Hangzhou 310018, Peoples R China
[2] Jiangxi Univ Sci & Technol, Sch Econ & Management, Ganzhou 341000, Peoples R China
[3] Fujian Agr & Forestry Univ, Sch Econ, Fuzhou 350000, Peoples R China
[4] Zhejiang Sci Tech Univ, Sch Informat Sci & Technol, Hangzhou 310018, Peoples R China
[5] Wuhan Univ, Sch Comp Sci, State Key Lab Software Engn, Wuhan 430072, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Crowdsourcing; Feature extraction; Dictionaries; Support vector machines; Sentiment analysis; Classification algorithms; Text categorization; text classification; LDA topic model; crowdsourcing participants;
DOI
10.1109/ACCESS.2021.3101565
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The review text received by crowdsourcing participants contains valuable knowledge, opinions, and preferences. It is an important basis for employers making trading decisions and for crowdsourcing participants improving their service level and quality. However, a single review can contain both emotional polarities, and sentiment classification of review text with fuzzy emotional boundaries has received insufficient attention. This paper proposes a supervised text sentiment classification method with Latent Dirichlet Allocation (LDA) to improve the classification performance on review text with fuzzy sentiment boundaries. The data set consists of review text of crowdsourcing participants on the Zhubajie platform, and text features are extracted using N-gram, Word2vec, and TF-IDF algorithms. The LDA topic model is applied to expand the number of text features and to extract eight topics that affect employers' sentiment tendencies. Text classifiers are constructed based on Support Vector Machine (SVM), Random Forest (RF), Gradient Boosting Decision Tree (GBDT), and Extreme Gradient Boosting (XGBoost) algorithms, and the effectiveness of the sentiment classification methods is verified by ten-fold cross-validation and confusion matrices. Experimental results show that using the LDA topic model to extend the features of review text can effectively alleviate the classifier's difficulty in distinguishing the sentiment category of text in which words of different emotional polarity coexist, and can improve the classification of text with fuzzy emotional boundaries. With text features extracted by TF-IDF and expanded by LDA, the GBDT text sentiment classifier achieves an accuracy of 0.881, and the F1-measures of the second, third, fourth, and fifth category samples are 0.462, 0.571, 0.278, and 0.647 respectively. This is better than the SVM, RF, and XGBoost classifiers, making it the best-performing classifier.
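The abstract reports overall accuracy and per-category F1-measure obtained from a confusion matrix. As a minimal sketch of how such figures are typically derived, the following computes both from a matrix of counts. The function names `per_class_f1` and `accuracy` and the 3-class `toy` matrix are illustrative assumptions; the numbers are made up and are not the paper's data.

```python
def per_class_f1(confusion):
    """Return one F1 score per class.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    scores = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(confusion[k]) - tp                       # true k, predicted class differs
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def accuracy(confusion):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return correct / total if total else 0.0


# Hypothetical 3-class counts, for illustration only.
toy = [
    [8, 1, 1],
    [2, 5, 3],
    [0, 2, 18],
]

print(accuracy(toy))
print(per_class_f1(toy))
```

A high overall accuracy can coexist with low F1 on sparsely populated categories, which may be why the abstract reports per-category F1 alongside accuracy.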
Pages: 108131-108143
Page count: 13
Related papers
50 in total
  • [21] Topic analysis based on LDA model
    College of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China
    Unknown
    Unknown
    Zidonghua Xuebao Acta Auto. Sin., 2009, 12 (1586-1592):
  • [22] Topic-based sentiment analysis of hotel reviews
    Gharzouli, Mohamed
    Hamama, Aimen Khalil
    Khattabi, Zakaria
    CURRENT ISSUES IN TOURISM, 2022, 25 (09) : 1368 - 1375
  • [23] Integrating Topic, Sentiment, and Syntax for Modeling Online Reviews: A Topic Model Approach
    Tang, Min
    Jin, Jian
    Liu, Ying
    Li, Chunping
    Zhang, Weiwen
    JOURNAL OF COMPUTING AND INFORMATION SCIENCE IN ENGINEERING, 2019, 19 (01)
  • [24] Multi-aspect Blog Sentiment Analysis Based on LDA Topic Model and Hownet Lexicon
    Fu, Xianghua
    Liu, Guo
    Guo, Yanyan
    Guo, Wubiao
    WEB INFORMATION SYSTEMS AND MINING, PT II, 2011, 6988 : 131 - 138
  • [25] Text feature selection for sentiment classification of Chinese online reviews
    Wang, Hongwei
    Yin, Pei
    Yao, Jiani
    Liu, James N. K.
    JOURNAL OF EXPERIMENTAL & THEORETICAL ARTIFICIAL INTELLIGENCE, 2013, 25 (04) : 425 - 439
  • [26] Enhanced Sentiment Classification for Informal Myanmar Text of Restaurant Reviews
    Aye, Yu Mon
    Aung, Sint Sint
    2018 IEEE/ACIS 16TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING RESEARCH, MANAGEMENT AND APPLICATION (SERA), 2018, : 31 - 36
  • [27] Query Classification using LDA Topic Model and Sparse Representation Based Classifier
    Bhattacharya, Indrani
    Sil, Jaya
    PROCEEDINGS OF THE THIRD ACM IKDD CONFERENCE ON DATA SCIENCES (CODS), 2016,
  • [28] Joint Sentiment Topic Model for objective text clustering
    Sanchez, Octavio
    Sierra, Gerardo
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2019, 36 (04) : 3119 - 3128
  • [29] Text Similarity Computing Based on LDA Topic Model and Word Co-occurrence
    Shao, Minglai
    Qin, Liangxi
    PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, KNOWLEDGE ENGINEERING AND INFORMATION ENGINEERING (SEKEIE 2014), 2014, 114 : 199 - 203
  • [30] Weakly Supervised Feature Compression Based Topic Model for Sentiment Classification
    Hu, Yan
    Xu, Xiaofei
    Li, Li
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT (KSEM 2017): 10TH INTERNATIONAL CONFERENCE, KSEM 2017, MELBOURNE, VIC, AUSTRALIA, AUGUST 19-20, 2017, PROCEEDINGS, 2017, 10412 : 29 - 41