Hybrid Topic Cluster Models for Social Healthcare Data

Cited by: 0
Authors
Prasad, K. Rajendra [1]
Mohammed, Moulana [2]
Noorullah, R. M. [2]
Affiliations
[1] Inst Aeronaut Engn, Dept CSE, Hyderabad, India
[2] Koneru Lakshmaiah Univ, Dept CSE, Guntur, Andhra Pradesh, India
Keywords
Multi-viewpoint based metric; traditional topic models; hybrid topic models; topic visualization; health tendency
DOI
10.14569/IJACSA.2019.0101168
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Social media, and microblogs in particular, are becoming an important data source for disease surveillance, behavioral medicine, and public healthcare. Topic models are widely used in microblog analytics for analyzing and integrating the textual data within a corpus. This paper treats health tweets as microblogs and clusters the health data using topic models. Traditional topic models, such as Latent Semantic Indexing (LSI), Probabilistic Latent Semantic Indexing (PLSI), Latent Dirichlet Allocation (LDA), Non-negative Matrix Factorization (NMF), and integer Joint NMF (intJNMF), are used for health data clustering; however, they cannot by themselves determine the number of health topic clusters. Proper visualizations are essential for extracting information from the data and identifying its trends, since a collection may include thousands of documents and millions of words. To visualize topic clouds and health tendency in the document collection, we present hybrid topic models that integrate traditional topic models with Visual Assessment of Tendency (VAT). The proposed hybrid topic models, namely Visual Non-negative Matrix Factorization (VNMF), Visual Latent Dirichlet Allocation (VLDA), Visual Probabilistic Latent Semantic Indexing (VPLSI), and Visual Latent Semantic Indexing (VLSI), are promising methods for assessing the health tendency and visualizing topic clusters in benchmark and Twitter datasets. The experimental section evaluates and compares the hybrid topic models to demonstrate their efficiency under different distance measures, including Euclidean distance, cosine distance, and multi-viewpoint cosine similarity.
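The idea of pairing a topic model with VAT can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a topic model such as NMF or LDA has already produced a document-topic weight matrix (the values in `doc_topic` below are made up for illustration), and it applies the standard VAT reordering to the pairwise cosine distances. The function names are hypothetical, and the multi-viewpoint similarity measure from the paper is not reproduced here.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two topic-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # treat an all-zero vector as maximally dissimilar
    return 1.0 - dot / (norm_u * norm_v)


def dissimilarity_matrix(rows, dist):
    """Build the full pairwise dissimilarity matrix for the documents."""
    n = len(rows)
    return [[dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]


def vat_order(R):
    """VAT ordering: start at an endpoint of the largest dissimilarity,
    then repeatedly append the unvisited item closest to the visited set."""
    n = len(R)
    start = max(range(n), key=lambda i: max(R[i]))
    order = [start]
    remaining = set(range(n)) - {start}
    while remaining:
        nxt = min(remaining, key=lambda q: min(R[p][q] for p in order))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def reorder(R, order):
    """Permute rows and columns of R; clusters appear as dark diagonal blocks
    when the result is drawn as a grayscale image."""
    return [[R[i][j] for j in order] for i in order]


# Hypothetical document-topic weights for six documents and two topics,
# deliberately interleaved so the raw matrix shows no block structure.
doc_topic = [
    [0.90, 0.10],
    [0.10, 0.90],
    [0.80, 0.20],
    [0.20, 0.80],
    [0.95, 0.05],
    [0.05, 0.95],
]

R = dissimilarity_matrix(doc_topic, cosine_distance)
order = vat_order(R)
reordered = reorder(R, order)

print("VAT order:", order)
for row in reordered:
    print(" ".join(f"{d:.2f}" for d in row))
```

In the reordered matrix, documents dominated by the same topic end up adjacent, so the number of low-distance blocks along the diagonal suggests the number of topic clusters. Swapping `cosine_distance` for a Euclidean distance function changes only the dissimilarity step.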
Pages: 490-506
Number of pages: 17