Assessing speaker independence on a speech-based depression level estimation system

Cited by: 20
Authors
Lopez-Otero, Paula [1]
Docio-Fernandez, Laura [1]
Garcia-Mateo, Carmen [1]
Affiliations
[1] Univ Vigo, AtlantTIC Res Ctr, EE Telecomunicac, Vigo 36310, Spain
Keywords
Soft biometrics; Depression; iVectors; Speaker independence; INVENTORY;
DOI
10.1016/j.patrec.2015.05.017
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Soft biometrics refers to traits that provide valuable information about an individual without being sufficient for their authentication, as they lack uniqueness and distinctiveness. This definition includes features related to the psychological state of individuals, such as emotions or mental health disorders like depression. Depression has recently attracted the attention of speech researchers, with the Audio/Visual Emotion Challenge (AVEC) 2013 and 2014 organized to encourage researchers to develop approaches that accurately estimate a speaker's depression level. The experimental frameworks provided for these challenges do not take speaker independence into account in their design, despite this being an important factor in developing a robust speech-based system. We assess the influence of prior knowledge of the speakers in a depression estimation experiment, using a state-of-the-art iVector-based approach to depression level estimation to perform both a speaker-dependent experiment and a speaker-independent experiment. We conclude that having previous information about the depression level of a given speaker dramatically improves system performance. Hence, we suggest that experimental frameworks must be carefully designed in order to serve as a genuinely useful resource for the development of robust depression estimation systems. (C) 2015 Elsevier B.V. All rights reserved.
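The distinction the abstract draws between speaker-dependent and speaker-independent experiments can be illustrated with a minimal sketch of a speaker-disjoint data split. This is not the authors' code or the AVEC partitioning; the function names, the `(speaker_id, recording_id)` record layout, and the toy data are all hypothetical, chosen only to show how one might guarantee that no speaker appears in both the training and test sets.

```python
import random
from collections import defaultdict


def speaker_independent_split(recordings, test_fraction=0.3, seed=0):
    """Split (speaker_id, recording_id) pairs so that no speaker
    contributes recordings to both the training and the test set."""
    by_speaker = defaultdict(list)
    for speaker_id, recording_id in recordings:
        by_speaker[speaker_id].append(recording_id)

    # Shuffle speakers (not recordings) so the split is made per speaker.
    speakers = sorted(by_speaker)
    random.Random(seed).shuffle(speakers)

    # Hold out at least one speaker for testing.
    n_test = max(1, round(len(speakers) * test_fraction))
    test_speakers = set(speakers[:n_test])

    train = [(s, r) for s, r in recordings if s not in test_speakers]
    test = [(s, r) for s, r in recordings if s in test_speakers]
    return train, test


def shared_speakers(train, test):
    """Return the set of speakers present in both partitions."""
    return {s for s, _ in train} & {s for s, _ in test}


# Hypothetical toy data: 5 speakers with 2 recordings each.
recordings = [(f"spk{i}", f"rec{i}_{j}") for i in range(5) for j in range(2)]
train, test = speaker_independent_split(recordings)
print(shared_speakers(train, test))  # empty set: the partitions are speaker-disjoint
```

A speaker-dependent setup, by contrast, would shuffle individual recordings, so the same speaker could occur on both sides of the split and `shared_speakers` would generally be non-empty.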
Pages: 343-350
Number of pages: 8