Harnessing multimodal approaches for depression detection using large language models and facial expressions

Cited by: 0
Authors
Misha Sadeghi [1 ]
Robert Richer [1 ]
Bernhard Egger [2 ]
Lena Schindler-Gmelch [3 ]
Lydia Helene Rupp [3 ]
Farnaz Rahimi [1 ]
Matthias Berking [3 ]
Bjoern M. Eskofier [1 ]
Affiliations
[1] Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Machine Learning and Data Analytics Lab (MaD Lab), Department Artificial Intelligence in Biomedical Engineering (AIBE)
[2] Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Chair of Visual Computing (LGDV), Department of Computer Science
[3] Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Chair of Clinical Psychology and Psychotherapy (KliPs)
[4] Institute of AI for Health, Translational Digital Health Group
[5] Helmholtz Zentrum München - German Research Center for Environmental Health
DOI
10.1038/s44184-024-00112-8
Abstract
Detecting depression is a critical component of mental health diagnosis, and accurate assessment is essential for effective treatment. This study introduces a novel, fully automated approach to predicting depression severity using the E-DAIC dataset. We employ Large Language Models (LLMs) to extract depression-related indicators from interview transcripts, using the Patient Health Questionnaire-8 (PHQ-8) score to train the prediction model. Additionally, facial data extracted from video frames are integrated with the textual data to create a multimodal model for depression severity prediction. We evaluate three approaches: text-based features, facial features, and a combination of both. Our findings show that the best results are achieved by enhancing text data with speech quality assessment, with a mean absolute error of 2.85 and a root mean square error of 4.02. This study underscores the potential of automated depression detection, showing text-only models to be robust and effective while paving the way for multimodal analysis.
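The following is a minimal sketch, not the authors' pipeline, of the multimodal setup the abstract describes: hypothetical LLM-derived text features and facial features are fused by concatenation and fed to a regressor predicting PHQ-8 scores, then scored with the same MAE and RMSE metrics reported above. The feature dimensions, the random-forest regressor, and all variable names are assumptions for illustration only.

```python
# Minimal sketch (assumed components, not the paper's actual code):
# late fusion of text and facial features for PHQ-8 regression.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder inputs: in the study, text features come from LLM analysis of
# interview transcripts and facial features from video frames; here they are
# random arrays with assumed dimensionality.
n_interviews = 200
text_feats = rng.normal(size=(n_interviews, 16))   # assumed LLM-derived indicators
face_feats = rng.normal(size=(n_interviews, 32))   # assumed facial descriptors
phq8 = rng.integers(0, 25, size=n_interviews)      # PHQ-8 scores range from 0 to 24

# Late fusion: concatenate both modalities before regression.
X = np.concatenate([text_feats, face_feats], axis=1)
X_train, X_test, y_train, y_test = train_test_split(X, phq8, random_state=0)

model = RandomForestRegressor(random_state=0).fit(X_train, y_train)
pred = model.predict(X_test)

# Evaluation metrics used in the abstract (MAE and RMSE).
mae = mean_absolute_error(y_test, pred)
rmse = np.sqrt(mean_squared_error(y_test, pred))
print(f"MAE={mae:.2f}, RMSE={rmse:.2f}")
```

With random placeholder features the printed errors are meaningless; the sketch only illustrates how modality fusion and the reported metrics fit together.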
Related Papers (50 in total)
  • [1] Understanding Naturalistic Facial Expressions with Deep Learning and Multimodal Large Language Models
    Bian, Yifan
    Kuester, Dennis
    Liu, Hui
    Krumhuber, Eva G.
    SENSORS, 2024, 24 (01)
  • [2] Contextual Object Detection with Multimodal Large Language Models
    Zang, Yuhang
    Li, Wei
    Han, Jun
    Zhou, Kaiyang
    Loy, Chen Change
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2025, 133 (02) : 825 - 843
  • [3] Detection of facial expressions of emotions in depression
    Suslow, T
    Junghanns, K
    Arolt, V
    PERCEPTUAL AND MOTOR SKILLS, 2001, 92 (03) : 857 - 868
  • [4] Harnessing the Power of Large Language Models
    Hofmann, Meike
    Burch, Gerald F.
    Burch, Jana J.
    ISACA Journal, 2024, 1 : 32 - 39
  • [5] Harnessing multimodal large language models for traffic knowledge graph generation and decision-making
    Kuang, Senyun
    Liu, Yang
    Wang, Xin
    Wu, Xinhua
    Wei, Yintao
    COMMUNICATIONS IN TRANSPORTATION RESEARCH, 2024, 4
  • [6] Using Augmented Small Multimodal Models to Guide Large Language Models for Multimodal Relation Extraction
    He, Wentao
    Ma, Hanjie
    Li, Shaohua
    Dong, Hui
    Zhang, Haixiang
    Feng, Jie
    APPLIED SCIENCES-BASEL, 2023, 13 (22)
  • [7] Towards Emotion Detection in Educational Scenarios from Facial Expressions and Body Movements through Multimodal Approaches
    Saneiro, Mar
    Santos, Olga C.
    Salmeron-Majadas, Sergio
    Boticario, Jesus G.
    SCIENTIFIC WORLD JOURNAL, 2014
  • [8] Harnessing Large Language Models for Software Vulnerability Detection: A Comprehensive Benchmarking Study
    Tamberg, Karl
    Bahsi, Hayretdin
    IEEE ACCESS, 2025, 13 : 29698 - 29717
  • [9] InteraRec: Interactive Recommendations Using Multimodal Large Language Models
    Karra, Saketh Reddy
    Tulabandhula, Theja
    TRENDS AND APPLICATIONS IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2024 WORKSHOPS, RAFDA AND IWTA, 2024, 14658 : 32 - 43
  • [10] Harnessing Large Language Models for Chart Review
    Xu, Dongchu
    Cunningham, Jonathan W.
    JOURNAL OF THE AMERICAN HEART ASSOCIATION, 2025, 14 (07)