Sentiment Classification of Crowdsourcing Participants' Reviews Text Based on LDA Topic Model

Cited by: 19

Authors:

Huang, Yanrong [1]

Wang, Rui [2,3]

Huang, Bin [1]

Wei, Bo [4]

Zheng, Shu Li [1]

Chen, Min [5]

Affiliations:

[1] Zhejiang Univ Water Resource & Elect Power, Coll Econ & Management, Hangzhou 310018, Peoples R China

[2] Jiangxi Univ Sci & Technol, Sch Econ & Management, Ganzhou 341000, Peoples R China

[3] Fujian Agr & Forestry Univ, Sch Econ, Fuzhou 350000, Peoples R China

[4] Zhejiang Sci Tech Univ, Sch Informat Sci & Technol, Hangzhou 310018, Peoples R China

[5] Wuhan Univ, Sch Comp Sci, State Key Lab Software Engn, Wuhan 430072, Peoples R China

Source:

IEEE Access | 2021 / Vol. 9

Funding:

National Natural Science Foundation of China;

Keywords:

Crowdsourcing; Feature extraction; Dictionaries; Support vector machines; Sentiment analysis; Classification algorithms; Text categorization; text classification; LDA topic model; crowdsourcing participants;

DOI:

10.1109/ACCESS.2021.3101565

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

The review text received by crowdsourcing participants contains valuable knowledge, opinions, and preferences. It is an important basis on which employers make trading decisions and crowdsourcing participants improve their service level and quality. However, review text can contain two kinds of emotional polarity at once, and sentiment classification of review text with such fuzzy emotional boundaries has received insufficient attention. This paper proposes a supervised text sentiment classification method based on Latent Dirichlet Allocation (LDA) to improve classification performance on review text with fuzzy sentiment boundaries. The review text of crowdsourcing participants on the Zhubajie platform serves as the data set, and the N-gram, Word2vec, and TF-IDF algorithms are used to extract text features. The LDA topic model is applied to expand the number of text features and to extract eight topics that affect employers' sentiment tendencies. Text classifiers are constructed with the Support Vector Machine (SVM), Random Forest (RF), Gradient Boosting Decision Tree (GBDT), and Extreme Gradient Boosting (XGBoost) algorithms, and the effectiveness of the sentiment classification methods is verified by ten-fold cross-validation and confusion matrices. Experimental results show that extending the features of review text with the LDA topic model effectively alleviates the classifier's difficulty in distinguishing the sentiment category of text in which words of different emotional polarity coexist, and improves the classification of text with fuzzy emotional boundaries. With text features extracted by TF-IDF and expanded by LDA, the GBDT text sentiment classifier achieves an accuracy of 0.881, and its F1-measures on the second, third, fourth, and fifth category samples are 0.462, 0.571, 0.278, and 0.647, respectively; it outperforms the SVM, RF, and XGBoost classifiers and has the best classification performance.
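
The abstract reports a single overall accuracy alongside considerably lower per-category F1-measures. As a minimal sketch of how both kinds of figure can be read off one confusion matrix, the following computes them in plain Python. The function name `per_class_f1` and the three-class toy counts are assumptions made up for illustration; they are not the paper's code or data.

```python
def per_class_f1(confusion):
    """Return (accuracy, list of per-class F1) for a square confusion matrix.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    f1_scores = []
    for k in range(n):
        tp = confusion[k][k]                               # true positives for class k
        fn = sum(confusion[k]) - tp                        # class-k samples predicted as another class
        fp = sum(confusion[i][k] for i in range(n)) - tp   # other classes predicted as class k
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the class never occurs
        f1_scores.append(2 * tp / denom if denom else 0.0)
    accuracy = correct / total if total else 0.0
    return accuracy, f1_scores


# Hypothetical counts, invented for illustration only (not from the paper).
toy = [
    [90, 5, 5],
    [6, 3, 1],
    [4, 2, 4],
]
accuracy, f1 = per_class_f1(toy)
print(round(accuracy, 3), [round(s, 3) for s in f1])
```

In this toy matrix the large first class dominates the accuracy, while the two small classes have much lower F1 values, which is one way a high accuracy and low per-category scores can coexist.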

Pages: 108131-108143

Page count: 13
