Weakly supervised topic sentiment joint model with word embeddings

Cited by: 31
Authors
Fu, Xianghua [1]
Sun, Xudong [1]
Wu, Haiying [1]
Cui, Laizhong [1]
Huang, Joshua Zhexue [1]
Affiliations
[1] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen, Peoples R China
Keywords
Sentiment analysis; Topic model; Topic sentiment joint model; Word embeddings;
DOI
10.1016/j.knosys.2018.02.012
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Topic sentiment joint models aim to extract the mixture of topics and sentiments from online reviews simultaneously. Most existing topic sentiment modeling algorithms are based on the state-of-the-art latent Dirichlet allocation (LDA) and probabilistic latent semantic analysis (PLSA), which infer sentiment and topic distributions from word co-occurrence. These methods have been used successfully for topic and sentiment analysis. However, when the training corpus is small or the documents are short, the textual features become sparse, and the inferred sentiment and topic distributions may be unsatisfactory. In this paper, we propose a novel topic sentiment joint model called the weakly supervised topic sentiment joint model with word embeddings (WS-TSWE), which incorporates word embeddings and the HowNet lexicon simultaneously to improve topic identification and sentiment recognition. The main contributions of WS-TSWE are twofold. (1) Existing models generate words only from the sentiment-topic-to-word Dirichlet multinomial component, whereas WS-TSWE replaces it with a mixture of two components: a Dirichlet multinomial component and a word embeddings component. Because the word embeddings are trained on very large corpora and can extend the semantic information of words, they help alleviate textual sparsity. (2) Most previous models incorporate sentiment knowledge in the beta priors, which are usually set from a dictionary and rely entirely on prior domain knowledge to identify positive and negative words. In contrast, WS-TSWE calculates the sentiment orientation of each word with the HowNet lexicon and automatically infers sentiment-based beta priors for sentiment analysis and opinion mining. Furthermore, we implement WS-TSWE with Gibbs sampling. The experimental results on Chinese and English data sets show that WS-TSWE achieves strong performance on the task of detecting sentiment and topics simultaneously. (c) 2018 Elsevier B.V. All rights reserved.
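
The two-component word generation described in contribution (1) can be illustrated with a small sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: the embedding component is written as a softmax over dot products between a sentiment-topic vector and the word vectors, and the mixing weight `lam`, the function names, and the toy data are all hypothetical rather than taken from the paper.

```python
import math


def dirichlet_multinomial_prob(word_counts, beta, word_id):
    """Smoothed count estimate of p(word | sentiment, topic).

    word_counts[w] is how often word w is assigned to this sentiment-topic
    pair; beta[w] is the (possibly sentiment-informed) prior for word w.
    """
    total = sum(word_counts) + sum(beta)
    return (word_counts[word_id] + beta[word_id]) / total


def embedding_prob(topic_vector, word_vectors, word_id):
    """Assumed embedding component: softmax over topic-word dot products."""
    scores = [
        sum(t * v for t, v in zip(topic_vector, vec)) for vec in word_vectors
    ]
    max_score = max(scores)  # subtract the max for numerical stability
    exp_scores = [math.exp(s - max_score) for s in scores]
    return exp_scores[word_id] / sum(exp_scores)


def mixture_word_prob(word_id, word_counts, beta, topic_vector, word_vectors, lam):
    """Mix the two components; lam is a hypothetical mixing weight in [0, 1]."""
    dm = dirichlet_multinomial_prob(word_counts, beta, word_id)
    emb = embedding_prob(topic_vector, word_vectors, word_id)
    return (1.0 - lam) * dm + lam * emb


# Toy vocabulary of three words; word 1 never co-occurs with this
# sentiment-topic pair, but its vector is close to that of word 0.
word_counts = [3, 0, 1]
beta = [0.01, 0.01, 0.01]
word_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
topic_vector = [2.0, 0.0]

probs = [
    mixture_word_prob(w, word_counts, beta, topic_vector, word_vectors, lam=0.5)
    for w in range(len(word_counts))
]
```

In this toy setup the unseen word 1 receives more probability mass from the mixture than from the count-based component alone, because its vector lies near the topic vector. That is the intuition behind using embeddings to counter sparse co-occurrence statistics in short texts.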
Pages: 43-54
Number of pages: 12