Natural Language Processing for Depression Prediction on Sina Weibo: Method Study and Analysis

Cited by: 0
Authors
Zhang, Zhenwen [1]
Zhu, Jianghong [1]
Guo, Zhihua [1]
Zhang, Yu [1]
Li, Zepeng [1]
Hu, Bin [1]
Affiliations
[1] Lanzhou Univ, Sch Informat Sci & Engn, Gansu Prov Key Lab Wearable Comp, 222 South Tianshui Rd, Lanzhou 730000, Peoples R China
Source
JMIR MENTAL HEALTH | 2024 / Volume 11
Funding
National Natural Science Foundation of China;
Keywords
depression; social media; natural language processing; deep learning; mental health; statistical analysis; linguistic analysis; Sina Weibo; risk prediction; mood analysis; SEVERITY; HEALTH; VALIDITY;
DOI
10.2196/58259
Chinese Library Classification number
R749 [Psychiatry];
Discipline classification code
100205;
Abstract
Background: Depression is a pressing global public health concern that affects the physical and mental well-being of hundreds of millions of people worldwide. Despite advances in clinical practice, many individuals at risk for depression still face significant barriers to timely diagnosis and effective treatment, which exacerbates a growing social health crisis.
Objective: This study seeks to develop a novel online depression risk detection method that uses natural language processing technology to identify individuals at risk of depression on the Chinese social media platform Sina Weibo.
Methods: First, we collected 527,333 posts publicly shared over 1 year by 1600 individuals with depression and 1600 individuals without depression on the Sina Weibo platform. We then developed a hierarchical transformer network for learning user-level semantic representations, which consists of 3 primary components: a word-level encoder, a post-level encoder, and a semantic aggregation encoder. The word-level encoder learns semantic embeddings from individual posts, while the post-level encoder explores features in user post sequences. The semantic aggregation encoder aggregates post sequence semantics to generate a user-level semantic representation, which a classifier then uses to predict the risk of depression (depressed or nondepressed). Finally, we conducted statistical and linguistic analyses of the post content from individuals with and without depression using the Chinese Linguistic Inquiry and Word Count.
Results: We divided the original data set into training, validation, and test sets. The training set consisted of 1000 individuals with depression and 1000 individuals without depression. The validation and test sets each comprised 600 users, with 300 individuals from each cohort (depression and nondepression). Without sampling techniques, our method achieved an accuracy of 84.62%, precision of 84.43%, recall of 84.50%, and F1-score of 84.32% on the test set. With our proposed retrieval-based sampling strategy, performance improved markedly: an accuracy of 95.46%, precision of 95.30%, recall of 95.70%, and F1-score of 95.43%. These results support the effectiveness of the proposed depression risk detection model and retrieval-based sampling technique and offer new insights for large-scale depression detection through social media. Through language behavior analysis, we found that individuals with depression are more likely to use negation words (the value of "swear" is 0.001253). This may indicate the presence of negative emotions, rejection, doubt, disagreement, or aversion in individuals with depression. Our analysis also showed that individuals with depression tend to use negative emotional vocabulary ("NegEmo": 0.022306; "Anx": 0.003829; "Anger": 0.004327; "Sad": 0.005740), which may reflect their internal negative emotions and psychological state. This frequent use of negative vocabulary could be a way for individuals with depression to express negative feelings toward life, themselves, or their surrounding environment.
Conclusions: The results indicate that using deep learning methods to detect the risk of depression is feasible and effective. These findings provide insights into the potential for large-scale, automated, and noninvasive prediction of depression among online social media users.
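The category values quoted in the Results (for example "NegEmo": 0.022306) read most naturally as lexicon-category usage rates. The following is a minimal Python sketch of how such a rate could be computed, assuming (this is not stated in the abstract) that each value is the share of a group's tokens that match a category word list. The lexicon `TOY_LEXICON`, the function `category_rates`, and the sample posts are hypothetical illustrations, not the Chinese Linguistic Inquiry and Word Count dictionary or the authors' code.

```python
from collections import Counter

# Hypothetical toy lexicon (assumption): the real Chinese LIWC dictionary is
# much larger and is applied to word-segmented Chinese text.
TOY_LEXICON = {
    "NegEmo": {"sad", "afraid", "angry", "hopeless"},
    "Anx": {"afraid", "worried"},
    "Anger": {"angry", "hate"},
    "Sad": {"sad", "hopeless", "cry"},
}


def category_rates(posts, lexicon):
    """Return matched tokens / total tokens for each category.

    `posts` is a list of already tokenized posts (lists of strings).
    A token may count toward several categories if the lexicon lists it
    in more than one.
    """
    counts = Counter()
    total = 0
    for tokens in posts:
        total += len(tokens)
        for token in tokens:
            for category, words in lexicon.items():
                if token in words:
                    counts[category] += 1
    if total == 0:
        # No tokens at all: report a zero rate instead of dividing by zero.
        return {category: 0.0 for category in lexicon}
    return {category: counts[category] / total for category in lexicon}


# Made-up example posts, for illustration only.
posts = [
    ["i", "feel", "sad", "and", "hopeless"],
    ["so", "worried", "about", "tomorrow"],
    ["nice", "weather"],
]
rates = category_rates(posts, TOY_LEXICON)
```

Computing the rates separately for the depression and nondepression cohorts would then allow a per-category comparison between the two groups; whether the study normalizes per token, per post, or per user is not stated in the abstract.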
Pages: 17