Evaluation of text semantic features using latent dirichlet allocation model

Cited by: 0
Authors
Zhou C. [1]
Li N. [2, 3]
Zhang C. [2]
Yang X. [1]
Affiliations
[1] Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing
[2] Collaborative Innovation Center of eTourism, Tourism College, Beijing Union University, Beijing
[3] Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing
Keywords
Creative computing; Information architecture; Latent Dirichlet allocation (LDA); Online reviews; Semantic feature
DOI
10.23940/ijpe.20.06.p15.968978
Chinese Library Classification (CLC) number
O212 [Mathematical statistics]
Discipline classification code
Abstract
Obtaining useful information from mass data on the Internet has been a hot topic in information processing research in recent years. For unstructured data such as online reviews written in natural language, the task becomes more challenging. Online consumer reviews reflect customers' real experiences and opinions on products or services. However, there is a shortage of methods or tools to help potential customers find high-quality, helpful reviews among a large number of reviews. This paper applied the concept and idea of creative computing to solve this problem. Tf-idf, as a traditional method to extract text features, measures the importance of words through word frequency and ignores the semantic information in the text data, while the topic model makes up for this deficiency. This paper proposed to use the vector of reviews allocated by the LDA topic model to represent text semantic features. Based on the semantic features of reviews, it calculated the cosine similarity between the thumbs-up reviews and other reviews and thus obtained the simulated helpfulness scores of all reviews. Then, a linear regression was designed to obtain two features, i.e., the syntax and semantic features, and determine the simulated helpfulness scores. The proposed method was validated on online tourism reviews of the Forbidden City and Mount Huang collected from three representative Chinese online tourism platforms. The results showed that the proposed method can effectively obtain and thus compare the helpfulness of online reviews in a creative way. © 2020 Totem Publisher, Inc. All rights reserved.
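The similarity step described in the abstract can be illustrated with a minimal Python sketch. It assumes each review has already been mapped to a document-topic vector by an LDA model; the vectors, the review names, and the choice to average the similarities over the thumbs-up reviews are all hypothetical, since the abstract does not state how the similarities are aggregated.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for an all-zero vector
    return dot / (norm_u * norm_v)


def simulated_helpfulness(review_vec, liked_vecs):
    """Mean cosine similarity to the thumbs-up reviews.

    Averaging is an assumption made for this sketch, not the paper's
    stated aggregation rule.
    """
    if not liked_vecs:
        return 0.0
    total = sum(cosine_similarity(review_vec, v) for v in liked_vecs)
    return total / len(liked_vecs)


# Hypothetical document-topic distributions over 3 topics
# (stand-ins for what an LDA model would infer).
liked = [[0.8, 0.1, 0.1], [0.7, 0.2, 0.1]]
others = {
    "review_a": [0.75, 0.15, 0.10],  # topic mix close to the liked reviews
    "review_b": [0.05, 0.05, 0.90],  # dominated by a different topic
}

scores = {name: simulated_helpfulness(vec, liked) for name, vec in others.items()}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

In this toy setup, a review whose topic mix resembles the thumbs-up reviews receives a higher score than one dominated by a different topic. Such scores could then serve as the target of a regression on other review features.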
Pages: 968-978
Page count: 10
Related papers
50 in total
  • [21] Accuracy of Unit Under Test Identification Using Latent Semantic Analysis and Latent Dirichlet Allocation
    Madeja, Matej
    Poruban, Jaroslav
    2019 IEEE 15TH INTERNATIONAL SCIENTIFIC CONFERENCE ON INFORMATICS (INFORMATICS 2019), 2019, : 161 - 166
  • [22] Extraction of Proper Names from Myanmar Text Using Latent Dirichlet Allocation
    Win, Yuzana
    Masada, Tomonari
    2016 CONFERENCE ON TECHNOLOGIES AND APPLICATIONS OF ARTIFICIAL INTELLIGENCE (TAAI), 2016, : 96 - 103
  • [23] Text data analysis using Latent Dirichlet Allocation: an application to FOMC transcripts
    Edison, Hali
    Carcel, Hector
    APPLIED ECONOMICS LETTERS, 2021, 28 (01) : 38 - 42
  • [24] Analysis of latent Dirichlet allocation and non-negative matrix factorization using latent semantic indexing
    Saqib, Sheikh Muhammad
    Ahmad, Shakeel
    Syed, Asif Hassan
    Naeem, Tariq
    Alotaibi, Fahad Mazaed
    INTERNATIONAL JOURNAL OF ADVANCED AND APPLIED SCIENCES, 2019, 6 (10) : 94 - 102
  • [25] Experimenting with Latent Semantic Analysis and Latent Dirichlet Allocation on Automated Essay Grading
    Hoblos, Jalaa
    2020 SEVENTH INTERNATIONAL CONFERENCE ON SOCIAL NETWORK ANALYSIS, MANAGEMENT AND SECURITY (SNAMS), 2020, : 153 - 159
  • [26] Supervised Latent Dirichlet Allocation With Covariates: A Bayesian Structural and Measurement Model of Text and Covariates
    Wilcox, Kenneth Tyler
    Jacobucci, Ross
    Zhang, Zhiyong
    Ammerman, Brooke A. A.
    PSYCHOLOGICAL METHODS, 2023, 28 (05) : 1178 - 1206
  • [27] A Latent Dirichlet Allocation and Fuzzy Clustering Based Machine Learning Model for Text Thesaurus
    Luo, J.
    Yu, D.
    Dai, Z.
    INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL, 2020, 15 (02)
  • [28] Evaluation of Stability and Similarity of Latent Dirichlet Allocation
    Tang, Jun
    Huo, Ruilong
    Yao, Jiali
    2013 FOURTH WORLD CONGRESS ON SOFTWARE ENGINEERING (WCSE), 2013, : 78 - 83
  • [29] Using Latent Dirichlet Allocation to Improve Text Classification Performance of Support Vector Machine
    Chen, Yaw-Huei
    Li, Shu-Fong
    2016 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2016, : 1280 - 1286
  • [30] Applying Latent Dirichlet Allocation Technique to Classify Topics on Sustainability Using Arabic Text
    Al Qudah, Islam
    Hashem, Ibrahim
    Soufyane, Abdelaziz
    Chen, Weisi
    Merabtene, Tarek
    INTELLIGENT COMPUTING, VOL 1, 2022, 506 : 630 - 638