Harnessing multimodal approaches for depression detection using large language models and facial expressions

Cited by: 0
Authors
Misha Sadeghi [1]
Robert Richer [1]
Bernhard Egger [2]
Lena Schindler-Gmelch [3]
Lydia Helene Rupp [3]
Farnaz Rahimi [1]
Matthias Berking [3]
Bjoern M. Eskofier [1]
Affiliations
[1] Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Machine Learning and Data Analytics Lab (MaD Lab), Department Artificial Intelligence in Biomedical Engineering (AIBE)
[2] Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Chair of Visual Computing (LGDV), Department of Computer Science
[3] Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Chair of Clinical Psychology and Psychotherapy (KliPs)
[4] Institute of AI for Health, Translational Digital Health Group
[5] Helmholtz Zentrum München - German Research Center for Environmental Health
DOI
10.1038/s44184-024-00112-8
Abstract
Detecting depression is a critical component of mental health diagnosis, and accurate assessment is essential for effective treatment. This study introduces a novel, fully automated approach to predicting depression severity using the E-DAIC dataset. We employ Large Language Models (LLMs) to extract depression-related indicators from interview transcripts, utilizing the Patient Health Questionnaire-8 (PHQ-8) score to train the prediction model. Additionally, facial data extracted from video frames is integrated with textual data to create a multimodal model for depression severity prediction. We evaluate three approaches: text-based features, facial features, and a combination of both. Our findings show the best results are achieved by enhancing text data with speech quality assessment, with a mean absolute error of 2.85 and root mean square error of 4.02. This study underscores the potential of automated depression detection, showing text-only models as robust and effective while paving the way for multimodal analysis.