A Deep Learning-Based Multimodal Architecture to predict Signs of Dementia

Cited by: 4
Authors
Ortiz-Perez, David [1 ]
Ruiz-Ponce, Pablo [1 ]
Tomas, David [2 ]
Garcia-Rodriguez, Jose [1 ]
Vizcaya-Moreno, M. Flores [3 ]
Leo, Marco [4 ]
Affiliations
[1] Univ Alicante, Dept Comp Sci & Technol, Carretera San Vicente Raspeig, Alicante 03690, Spain
[2] Univ Alicante, Dept Software & Comp Syst, Carretera San Vicente Raspeig, Alicante 03690, Spain
[3] Univ Alicante, Fac Hlth Sci, Unit Clin Nursing Res, Carretera San Vicente Raspeig, Alicante 03690, Spain
[4] Natl Res Council Italy, Inst Appl Sci & Intelligent Syst, I-73100 Lecce, Italy
Keywords
Multimodal; Deep learning; Transformers; Dementia prediction;
DOI
10.1016/j.neucom.2023.126413
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a multimodal deep learning architecture that combines text and audio information to predict dementia, a disease that affects around 55 million people worldwide and in some cases leaves them dependent on others. The system was evaluated on the DementiaBank Pitt Corpus dataset, which includes audio recordings and their transcriptions for healthy people and people with dementia. Several models were tested, including Convolutional Neural Networks (CNN) for audio classification, Transformers for text classification, and a combination of both in a multimodal ensemble. These models were evaluated on a test set; the best results were obtained with the text modality, which achieved 90.36% accuracy on the task of detecting dementia. Additionally, the corpus was analyzed for the sake of explainability, to obtain more information about how the models generate their predictions and to identify patterns in the data. © 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Pages: 10
Related papers
50 in total
  • [31] Exploration of Deep Learning-based Multimodal Fusion for Semantic Road Scene Segmentation
    Zhang, Yifei
    Morel, Olivier
    Blanchon, Marc
    Seulin, Ralph
    Rastgoo, Mojdeh
    Sidibe, Desire
    PROCEEDINGS OF THE 14TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS (VISAPP), VOL 5, 2019 : 336 - 343
  • [32] A Deep Learning-Based Multimodal Resource Reconstruction Scheme for Digital Enterprise Management
    Yang, Tingting
    Zheng, Bing
    JOURNAL OF CIRCUITS SYSTEMS AND COMPUTERS, 2023, 32 (11)
  • [33] Deep Learning-based Multimodal Control Interface for Human-Robot Collaboration
    Liu, Hongyi
    Fang, Tongtong
    Zhou, Tianyu
    Wang, Yuquan
    Wang, Lihui
    51ST CIRP CONFERENCE ON MANUFACTURING SYSTEMS, 2018, 72 : 3 - 8
  • [34] MUS Model: A Deep Learning-Based Architecture for IoT Intrusion Detection
    Yan, Yu
    Yang, Yu
    Fang, Shen
    Gao, Minna
    Chen, Yiding
    CMC-COMPUTERS MATERIALS & CONTINUA, 2024, 80 (01) : 875 - 896
  • [35] Application of deep learning-based multimodal fusion technology in cancer diagnosis: A survey
    Li, Yan
    Pan, Liangrui
    Peng, Yijun
    Li, Xiaoyu
    Wang, Xiang
    Qu, Limeng
    Song, Qiya
    Liang, Qingchun
    Peng, Shaoliang
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2025, 143
  • [36] A Deep Learning-Based Semantic Segmentation Architecture for Autonomous Driving Applications
    Masood, Sharjeel
    Ahmed, Fawad
    Alsuhibany, Suliman A.
    Ghadi, Yazeed Yasin
    Siyal, M. Y.
    Kumar, Harish
    Khan, Khyber
    Ahmad, Jawad
    WIRELESS COMMUNICATIONS & MOBILE COMPUTING, 2022, 2022
  • [37] Deep learning-based classification of dementia using image representation of subcortical signals
    Ranjan, Shivani
    Tripathi, Ayush
    Shende, Harshal
    Badal, Robin
    Kumar, Amit
    Yadav, Pramod
    Joshi, Deepak
    Kumar, Lalan
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2025, 25 (01)
  • [38] Multimodal Architecture for Emotion in Robots using Deep Learning
    Ghayoumi, Mehdi
    Bansal, Arvind K.
    PROCEEDINGS OF 2016 FUTURE TECHNOLOGIES CONFERENCE (FTC), 2016 : 901 - 907
  • [39] EmbraceNet: A robust deep learning architecture for multimodal classification
    Choi, Jun-Ho
    Lee, Jong-Seok
    INFORMATION FUSION, 2019, 51 : 259 - 270
  • [40] DEEP LEARNING-BASED ELECTROCARDIOGRAM ANALYSIS TO PREDICT MORTALITY IN REPAIRED TETRALOGY OF FALLOT
    Van Boxtel, Juul
    Mayourian, Joshua
    Sleeper, Lynn
    Diwanji, Vedang
    Geva, Alon
    O'Leary, Edward
    Triedman, John K.
    Ghelani, Sunil J.
    Valente, Anne Marie
    Geva, Tal
    JOURNAL OF THE AMERICAN COLLEGE OF CARDIOLOGY, 2024, 83 (13) : 1581 - 1581