Exploring Modulation Spectrum Features for Speech-Based Depression Level Classification

Authors
Bozkurt, Elif [1]
Toledo-Ronen, Orith [2]
Sorin, Alexander [2]
Hoory, Ron [2]
Affiliations
[1] Koc Univ, Multimedia Vis & Graph Lab, Istanbul, Turkey
[2] Haifa Univ Mt Carmel, IBM Res Haifa, Haifa, Israel
Source
15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4 | 2014
Keywords
depression assessment; modulation spectrum; prosody; feature fusion; decision fusion
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose a manageable Modulation Spectrum-based feature set for the detection of depressed speech. The Modulation Spectrum (MS) is obtained from the conventional speech spectrogram by spectral analysis along the temporal trajectories of the acoustic frequency bins. While the MS representation of speech provides rich, high-dimensional joint frequency information, extracting discriminative features from it remains an open question. We propose a lower-dimensional representation, which first employs a Mel frequency filterbank in the acoustic frequency domain and a Discrete Cosine Transform in the modulation frequency domain, and then applies feature selection in both domains. We compare and fuse the proposed feature set with other complementary prosodic and spectral features at the feature and decision levels. In our experiments, we use Support Vector Machines to discriminate depressed speech in a speaker-independent fashion. Feature-level fusion of the proposed MS-based features with other prosodic and spectral features after dimension reduction provides up to 9% improvement over the baseline results and also correlates the most with clinical ratings of patients' depression level.
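The representation described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it builds a magnitude spectrogram, applies a triangular Mel filterbank along acoustic frequency, takes the spectrum of each band's temporal trajectory to obtain a modulation spectrum, and then applies a DCT along modulation frequency. The function names, frame length, hop size, number of Mel bands, number of retained DCT coefficients, and the log compression of band energies are all assumptions chosen for illustration. Feature selection and the SVM classifier are omitted.

```python
import cmath
import math

def dft_mag(x):
    """Magnitude of the one-sided DFT of a real sequence (naive O(N^2))."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_bands, n_bins, sample_rate):
    """Triangular filters spaced evenly on the Mel scale, one row per band."""
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (n_bands + 1)) for i in range(n_bands + 2)]
    bin_hz = (sample_rate / 2.0) / (n_bins - 1)
    bank = []
    for b in range(n_bands):
        lo, mid, hi = edges[b], edges[b + 1], edges[b + 2]
        row = []
        for k in range(n_bins):
            f = k * bin_hz
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank

def dct2(x):
    """Unnormalized type-II DCT."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
            for k in range(n)]

def ms_features(signal, sample_rate, frame_len=64, hop=32, n_bands=8, n_dct=4):
    """Return n_bands rows of n_dct modulation-domain DCT coefficients.

    All default parameter values are illustrative assumptions.
    """
    # 1. Magnitude spectrogram from Hann-windowed frames.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]
    starts = range(0, len(signal) - frame_len + 1, hop)
    spec = [dft_mag([w * v for w, v in zip(window, signal[s:s + frame_len])])
            for s in starts]
    # 2. Mel filterbank along the acoustic frequency axis.
    bank = mel_filterbank(n_bands, frame_len // 2 + 1, sample_rate)
    features = []
    for row in bank:
        # Temporal trajectory of the (log-compressed) band energy.
        traj = [math.log(sum(h * m for h, m in zip(row, frame)) + 1e-10)
                for frame in spec]
        mean = sum(traj) / len(traj)
        # 3. Spectral analysis along time gives the modulation spectrum.
        mod_spec = dft_mag([v - mean for v in traj])
        # 4. DCT along modulation frequency; keep the first coefficients.
        features.append(dct2(mod_spec)[:n_dct])
    return features

# Usage: a 1 kHz tone whose amplitude is modulated at 4 Hz (synthetic input).
sample_rate = 8000
signal = [(1.0 + 0.5 * math.sin(2 * math.pi * 4 * n / sample_rate))
          * math.sin(2 * math.pi * 1000 * n / sample_rate)
          for n in range(4000)]
features = ms_features(signal, sample_rate)
```

The result is a small matrix (Mel bands by DCT coefficients) that could be flattened and passed to a feature selector and classifier. A practical implementation would use an FFT library rather than the naive DFT shown here.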
Pages: 1243-1247
Page count: 5
Related papers
50 items in total
  • [31] Enhanced multiclass SVM with thresholding fusion for speech-based emotion classification
    Yang N.
    Yuan J.
    Zhou Y.
    Demirkol I.
    Duan Z.
    Heinzelman W.
    Sturge-Apple M.
    International Journal of Speech Technology, 2017, 20 (01) : 27 - 41
  • [32] Automated speech-based screening of depression using deep convolutional neural networks
    Chlasta, Karol
    Wolk, Krzysztof
    Krejtz, Izabela
    CENTERIS2019--INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS/PROJMAN2019--INTERNATIONAL CONFERENCE ON PROJECT MANAGEMENT/HCIST2019--INTERNATIONAL CONFERENCE ON HEALTH AND SOCIAL CARE INFORMATION SYSTEMS AND TECHNOLOGIES, 2019, 164 : 618 - 628
  • [33] Enhancing accuracy and privacy in speech-based depression detection through speaker disentanglement
    Ravi, Vijay
    Wang, Jinhan
    Flint, Jonathan
    Alwan, Abeer
    COMPUTER SPEECH AND LANGUAGE, 2024, 86
  • [34] Empirical mode decomposition based weighted frequency feature for speech-based emotion classification
    Sethu, Vidhyasaharan
    Ambikairajah, Eliathamby
    Epps, Julien
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 5017 - 5020
  • [35] Speech-based Diagnosis of Autism Spectrum Condition by Generative Adversarial Network Representations
    Deng, Jun
    Cummins, Nicholas
    Schmitt, Maximilian
    Qian, Kun
    Ringeval, Fabien
    Schuller, Bjorn
    PROCEEDINGS OF THE 2017 INTERNATIONAL CONFERENCE ON DIGITAL HEALTH (DH'17), 2017, : 53 - 57
  • [36] Fusing features of speech for depression classification based on higher-order spectral analysis
    Miao, Xiaolin
    Li, Yao
    Wen, Min
    Liu, Yongyan
    Julian, Ibegbu Nnamdi
    Guo, Hao
    SPEECH COMMUNICATION, 2022, 143 : 46 - 56
  • [37] Toward Knowledge-Driven Speech-Based Models of Depression: Leveraging Spectrotemporal Variations in Speech Vowels
    Feng, Kexin
    2022 IEEE-EMBS INTERNATIONAL CONFERENCE ON BIOMEDICAL AND HEALTH INFORMATICS (BHI) JOINTLY ORGANISED WITH THE IEEE-EMBS INTERNATIONAL CONFERENCE ON WEARABLE AND IMPLANTABLE BODY SENSOR NETWORKS (BSN'22), 2022,
  • [38] Contrastive Learning with Multi-level Embeddings for Speech-Based Emotion Recognition
    Si, Mei
    HCI INTERNATIONAL 2024-LATE BREAKING POSTERS, HCII 2024, PT I, 2025, 2319 : 312 - 321
  • [39] Relative importance of speech and voice features in the classification of schizophrenia and depression
    Mark Berardi
    Katharina Brosch
    Julia-Katharina Pfarr
    Katharina Schneider
    Angela Sültmann
    Florian Thomas-Odenthal
    Adrian Wroblewski
    Paula Usemann
    Alexandra Philipsen
    Udo Dannlowski
    Igor Nenadić
    Tilo Kircher
    Axel Krug
    Frederike Stein
    Maria Dietrich
    Translational Psychiatry, 13
  • [40] Relative importance of speech and voice features in the classification of schizophrenia and depression
    Berardi, Mark
    Brosch, Katharina
    Pfarr, Julia-Katharina
    Schneider, Katharina
    Sueltmann, Angela
    Thomas-Odenthal, Florian
    Wroblewski, Adrian
    Usemann, Paula
    Philipsen, Alexandra
    Dannlowski, Udo
    Nenadic, Igor
    Kircher, Tilo
    Krug, Axel
    Stein, Frederike
    Dietrich, Maria
    TRANSLATIONAL PSYCHIATRY, 2023, 13 (01)