Depression detection using cascaded attention based deep learning framework using speech data

Cited by: 0
Authors
Gupta, Sachi [1]
Agarwal, Gaurav [2]
Agarwal, Shivani [3]
Pandey, Dilkeshwar [4]
Affiliations
[1] Galgotias Coll Engn & Technol, Dept Comp Sci & Engn, Greater Noida 201310, Uttar Pradesh, India
[2] Galgotias Univ, Sch Comp Sci & Engn, Gr Noida 203201, Uttar Pradesh, India
[3] Ajay Kumar Garg Engn Coll, Dept Informat Technol, Ghaziabad 201009, Uttar Pradesh, India
[4] KIET Grp Inst, Dept Comp Sci & Engn, Ghaziabad 201206, Uttar Pradesh, India
Keywords
Speech signals; Multi-stage Discrete Wavelet Transform; Auction Optimization; Deep convolutional Attention; Depression and Non-depression
DOI
10.1007/s11042-023-18076-w
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Detecting depression efficiently is a challenging problem in speech signal processing. Because speech signals carry useful diagnostic cues for depression, a reliable detection methodology is required. However, manual examination performed by radiologists can be time-consuming and may not be feasible in complex circumstances. Diverse detection methods have been proposed previously, but they are reported to be less accurate, time-consuming, and prone to high error rates. This article presents an effective, automatic deep learning-based method for detecting depression from speech signal data. Depression prediction involves five steps: data acquisition, pre-processing, feature extraction, feature selection, and classification. In the data acquisition step, speech signals are collected from the Distress Analysis Interview Corpus (DAIC-WOZ) and Sonde Health-free speech (SH2-FS) datasets. The collected data are pre-processed with a Multi-stage Discrete Wavelet Transform (MS_DWT) to suppress noise and improve signal quality. The relevant features are then extracted: the Hilbert-Huang (H-H) transform, linear prediction cepstrum coefficients (LPCC), fundamental frequency, formants, speaking rate, and Mel frequency cepstral coefficients (MFCC). From the extracted features, those best suited to improving detection accuracy are selected using the Price Auction optimization algorithm (PAOA). Finally, the depression and non-depression states are classified using a deep convolutional Attention Cascaded two directional long short-term memory network (DAttn_Conv 2D LSTM) with a softmax classifier. The overall accuracy obtained in classifying the depressed and non-depressed classes is 97.82% and 98.91%, respectively.
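The pre-processing step named in the abstract, multi-stage wavelet denoising, can be illustrated with a minimal sketch. The abstract does not state which wavelet, how many stages, or which thresholding rule the authors use, so the Haar wavelet, the soft-threshold rule, and the function names and parameters below (`multistage_dwt_denoise`, `stages`, `threshold`) are assumptions chosen for illustration and are not the paper's implementation.

```python
import math

_S = 1.0 / math.sqrt(2.0)  # Haar normalisation factor


def haar_dwt(signal):
    """One Haar analysis stage: return (approximation, detail) coefficients."""
    approx = [(signal[i] + signal[i + 1]) * _S for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * _S for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """One Haar synthesis stage: invert haar_dwt."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * _S)
        out.append((a - d) * _S)
    return out


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t (assumed denoising rule)."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def multistage_dwt_denoise(signal, stages=3, threshold=0.1):
    """Decompose over several stages, threshold the details, reconstruct.

    Hypothetical helper: the stage count and threshold are illustrative
    defaults, not values taken from the paper.
    """
    if len(signal) % (2 ** stages) != 0:
        raise ValueError("signal length must be divisible by 2**stages")
    approx = list(signal)
    details = []
    for _ in range(stages):
        approx, detail = haar_dwt(approx)
        details.append(soft_threshold(detail, threshold))
    # Rebuild from the coarsest stage back to the original resolution.
    for detail in reversed(details):
        approx = haar_idwt(approx, detail)
    return approx


# Usage: a slow sine wave with small alternating noise added.
noisy = [math.sin(0.2 * n) + 0.05 * (-1) ** n for n in range(32)]
clean = multistage_dwt_denoise(noisy, stages=3, threshold=0.1)
```

With `threshold=0.0` the function reproduces its input, which is a quick check that the analysis and synthesis stages are consistent. A production pipeline would more likely rely on an established wavelet library and a data-driven threshold.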
Pages: 66135-66173
Number of pages: 39