Evaluation of text semantic features using latent Dirichlet allocation model

Cited by: 0
Authors
Zhou C. [1]
Li N. [2, 3]
Zhang C. [2]
Yang X. [1]
Affiliations
[1] Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing
[2] Collaborative Innovation Center of eTourism, Tourism College, Beijing Union University, Beijing
[3] Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing
Keywords
Creative computing; Information architecture; Latent Dirichlet allocation (LDA); Online reviews; Semantic feature
DOI
10.23940/ijpe.20.06.p15.968978
Chinese Library Classification number
O212 [Mathematical statistics]
Abstract
Obtaining useful information from mass data on the Internet has been a hot topic in information processing research in recent years. For unstructured data such as online reviews written in natural language, the task is more challenging. Online consumer reviews reflect customers' real experiences of and opinions on products or services. However, there is a shortage of methods or tools to help potential customers find high-quality, helpful reviews among a large number of reviews. This paper applied the concepts and ideas of creative computing to solve this problem. TF-IDF, a traditional method for extracting text features, measures the importance of words through word frequency and ignores the semantic information in the text data, while the topic model makes up for this deficiency. This paper proposed using the vector of reviews allocated by the LDA topic model to represent text semantic features. Based on the semantic features of reviews, it calculated the cosine similarity between the thumbs-up reviews and other reviews and thus obtained the simulated helpfulness scores of all reviews. Then, a linear regression was designed to obtain two features, i.e., the syntax and semantic features, and determine the simulated helpfulness scores. The proposed method was validated on online tourism reviews of the Forbidden City and Mount Huang collected from three representative Chinese online tourism platforms. The results showed that the proposed method can effectively obtain and thus compare the helpfulness of online reviews in a creative way. © 2020 Totem Publisher, Inc. All rights reserved.
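The similarity step described in the abstract can be illustrated with a minimal sketch. It assumes each review has already been mapped to a document-topic vector by an LDA model; the vectors below are hypothetical values, not data from the paper. The function names and the choice to average the similarity over all thumbs-up reviews are also assumptions, since the abstract does not say how the similarities are aggregated. The regression step is not shown.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def simulated_helpfulness(topic_vectors, upvoted_ids):
    """Score every review by its mean cosine similarity to the thumbs-up reviews.

    Assumption: taking the mean is an illustrative aggregation choice only.
    """
    if not upvoted_ids:
        raise ValueError("at least one thumbs-up review is required")
    upvoted = [topic_vectors[i] for i in upvoted_ids]
    return [
        sum(cosine_similarity(vec, ref) for ref in upvoted) / len(upvoted)
        for vec in topic_vectors
    ]


# Hypothetical document-topic distributions over 3 topics.
topic_vectors = [
    [0.80, 0.10, 0.10],  # review 0: received a thumbs-up
    [0.70, 0.20, 0.10],  # review 1: topic mix close to review 0
    [0.05, 0.05, 0.90],  # review 2: dominated by a different topic
]
scores = simulated_helpfulness(topic_vectors, upvoted_ids=[0])
```

In this toy setting, review 1 scores higher than review 2 because its topic mix is closer to that of the thumbs-up review, which is the intuition behind using topic vectors rather than raw word frequencies.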
Pages: 968-978
Number of pages: 10