Hybrid Topic Cluster Models for Social Healthcare Data

Authors
Prasad, K. Rajendra [1]
Mohammed, Moulana [2]
Noorullah, R. M. [2]
Affiliations
[1] Inst Aeronaut Engn, Dept CSE, Hyderabad, India
[2] Koneru Lakshmaiah Univ, Dept CSE, Guntur, Andhra Pradesh, India
Keywords
Multi-viewpoint based metric; traditional topic models; hybrid topic models; topic visualization; health tendency
DOI
10.14569/IJACSA.2019.0101168
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Social media, and microblogs in particular, are becoming an important data source for disease surveillance, behavioral medicine, and public healthcare. Topic models are widely used in microblog analytics for analyzing and integrating the textual data within a corpus. This paper treats health tweets as microblogs and attempts to cluster the health data with topic models. Traditional topic models, such as Latent Semantic Indexing (LSI), Probabilistic Latent Semantic Indexing (PLSI), Latent Dirichlet Allocation (LDA), Non-negative Matrix Factorization (NMF), and integer Joint NMF (intJNMF), are used for health data clustering; however, assessing the number of health topic clusters with them is intractable. Proper visualizations are essential for extracting information from the data and identifying its trends, since a collection may include thousands of documents and millions of words. To visualize topic clouds and the health tendency in a document collection, we present hybrid topic models that integrate traditional topic models with Visual Assessment of Tendency (VAT). The proposed hybrid topic models, namely Visual Non-negative Matrix Factorization (VNMF), Visual Latent Dirichlet Allocation (VLDA), Visual Probabilistic Latent Semantic Indexing (VPLSI), and Visual Latent Semantic Indexing (VLSI), are promising methods for assessing the health tendency and visualizing topic clusters in benchmark and Twitter datasets. The experimental section evaluates and compares the hybrid topic models to demonstrate their efficiency under different distance measures, including Euclidean distance, cosine distance, and multi-viewpoint cosine similarity.
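The idea of pairing a topic model with VAT can be illustrated with a minimal sketch. It is not the authors' implementation: the document-topic matrix `W` below is hypothetical toy data standing in for the output of a topic model such as NMF or LDA, the function names are illustrative, and only cosine distance is shown (the multi-viewpoint measure is omitted). The reordering follows the generally known VAT procedure: start from an endpoint of the largest dissimilarity, then repeatedly append the unvisited item closest to any visited one.

```python
import numpy as np

def cosine_distance(X):
    """Pairwise cosine distance (1 - cosine similarity) between rows of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    unit = X / np.clip(norms, 1e-12, None)   # guard against all-zero rows
    D = 1.0 - unit @ unit.T
    np.fill_diagonal(D, 0.0)
    return np.clip(D, 0.0, None)             # remove tiny negative round-off

def vat_order(D):
    """Return the VAT ordering of the items described by dissimilarity matrix D."""
    n = D.shape[0]
    # Start from one endpoint of the largest pairwise dissimilarity.
    start, _ = np.unravel_index(np.argmax(D), D.shape)
    order = [int(start)]
    remaining = [k for k in range(n) if k != start]
    while remaining:
        # Pick the unvisited item nearest to any already-visited item.
        sub = D[np.ix_(order, remaining)]
        _, col = np.unravel_index(np.argmin(sub), sub.shape)
        order.append(remaining.pop(int(col)))
    return np.array(order)

# Hypothetical document-topic weights (6 documents, 2 topics), interleaved
# so that the two groups are not contiguous in the input order.
W = np.array([
    [0.90, 0.10],
    [0.10, 0.90],
    [0.80, 0.20],
    [0.20, 0.80],
    [0.85, 0.15],
    [0.15, 0.85],
])

D = cosine_distance(W)
order = vat_order(D)
D_vat = D[np.ix_(order, order)]          # reordered matrix; plot as an image
dominant = W.argmax(axis=1)[order]       # dominant topic of each document, in VAT order

print("VAT order:", order)
print("Dominant topic along that order:", dominant)
```

Displaying `D_vat` as a grayscale image (for example with matplotlib's `imshow`) would show dark square blocks along the diagonal, and the number of blocks suggests the number of topic clusters. Swapping `cosine_distance` for a Euclidean distance function changes only the input to `vat_order`.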
Pages: 490-506
Page count: 17