Tweet topics on cancer among Indian Twitter users-computational approach using latent Dirichlet allocation topic modelling

Cited by: 2
Authors
Ramamoorthy, Thilagavathi [1]
Mappillairaju, Bagavandas [2]
Affiliations
[1] SRM Inst Sci & Technol, Sch Publ Hlth, Kattankulathur 603203, Tamil Nadu, India
[2] SRM Inst Sci & Technol, Ctr Stat, Kattankulathur 603203, Tamil Nadu, India
Keywords
Twitter; Cancer; Latent Dirichlet allocation; Machine learning; Natural language processing; Social media; Topic modelling
DOI
10.1007/s42001-023-00222-x
Chinese Library Classification (CLC) number
O1 [Mathematics]; C [Social Sciences, General];
Discipline classification codes
03 ; 0303 ; 0701 ; 070101 ;
Abstract
Understanding the extent and content of conversations about cancer informs stakeholders about the community's needs in terms of knowledge, support and interventions. This study identified the topics of cancer-related tweets, the sources of the messages and the degree of reachability of the identified topics among Twitter users in India. Twitter messages geocoded within India, related to cancer and posted between September 15, 2021 and October 15, 2021 were retrieved using the Twitter application programming interface, based on keywords identified from Symplur Signals. The tweets were pre-processed to remove stop words, hashtags and Uniform Resource Locators. Tweets were visualized using word clouds and correlations between word tokens. The latent Dirichlet allocation (LDA) topic model, an unsupervised machine learning technique, was used to identify the commonly discussed cancer topics. A total of 6374 tweets from 3135 unique Twitter users were analysed in the study. The majority of the tweets (60.8%) were from individual Twitter users. The LDA model identified four topics: (1) prevention, early detection and promotion (36.1%), (2) seeking support and sharing personal experience (15.8%), (3) Human Papillomavirus vaccine and cancer research (13.4%), and (4) risk factors, treatment and raising awareness (34.7%). Among the four identified topics, prevention, early detection and promotion had the highest reachability. Twitter is being used as a potential alternative communication platform for disseminating cancer-related information in India. The topics identified in the study provide useful insights for public health professionals and organizations for aligning cancer-related engagement and education for the target audience.
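The pre-processing step described in the abstract (removing stop words, hashtags and Uniform Resource Locators) might look roughly like the Python sketch below. This is an illustration only, not the authors' code: the stop-word list, the regular expressions and the function name `preprocess` are all assumptions, and the sketch assumes that a hashtag is dropped together with its word rather than only its `#` symbol.

```python
import re

# Hypothetical, deliberately tiny stop-word list; the study's actual list is not given.
STOP_WORDS = {"a", "an", "and", "the", "is", "are", "of", "to", "in", "for", "on", "with"}

# Assumed patterns for URLs, hashtags and word tokens.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
HASHTAG_RE = re.compile(r"#\w+")
TOKEN_RE = re.compile(r"[a-z]+")


def preprocess(tweet: str) -> list[str]:
    """Lowercase a tweet, strip URLs and hashtags, then drop stop words."""
    text = tweet.lower()
    text = URL_RE.sub(" ", text)      # remove Uniform Resource Locators
    text = HASHTAG_RE.sub(" ", text)  # remove hashtags (symbol and word)
    return [tok for tok in TOKEN_RE.findall(text) if tok not in STOP_WORDS]


tokens = preprocess("Early detection of #cancer is the key: https://example.org/screening")
print(tokens)  # ['early', 'detection', 'key']
```

Token lists of this kind would then typically be converted to a document-term matrix before an LDA model is fitted; that step is not shown here.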
Pages: 1033-1054
Number of pages: 22
Source
Journal of Computational Social Science, 2023, 6 (2): 1033-1054