Assessing speaker independence on a speech-based depression level estimation system

Cited by: 20

Authors:

Lopez-Otero, Paula [1]

Docio-Fernandez, Laura [1]

Garcia-Mateo, Carmen [1]

Affiliations:

[1] Univ Vigo, AtlantTIC Res Ctr, EE Telecomunicac, Vigo 36310, Spain

Source:

PATTERN RECOGNITION LETTERS | 2015 / Vol. 68

Keywords:

Soft biometrics; Depression; iVectors; Speaker independence; INVENTORY;

DOI:

10.1016/j.patrec.2015.05.017

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Soft biometrics refers to traits that provide valuable information about an individual without being sufficient for their authentication, as they lack uniqueness and distinctiveness. This definition includes features related to the psychological state of individuals, such as emotions or mental health disorders like depression. Depression has recently attracted the attention of speech researchers: the Audio/Visual Emotion Challenge (AVEC) was organized in 2013 and 2014 to encourage researchers to develop approaches that accurately estimate a speaker's depression level. The evaluation frameworks provided for these challenges do not take speaker independence into account in their experimental design, despite this being an important factor in developing a robust speech-based system. We assess the influence of prior knowledge of the speakers in a depression estimation experiment by using a state-of-the-art iVector-based approach to depression level estimation to perform both a speaker-dependent experiment and a speaker-independent experiment. We conclude that having previous information about the depression level of a given speaker dramatically improves system performance. Hence, we suggest that experimental frameworks must be carefully designed in order to serve as a genuinely useful resource for the development of robust depression estimation systems. (C) 2015 Elsevier B.V. All rights reserved.
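
The distinction between a speaker-dependent and a speaker-independent experiment can be illustrated with a minimal Python sketch. It is not the authors' protocol: the function name, the `(speaker_id, recording_id)` data layout, and the toy recordings are all hypothetical, and the only property it demonstrates is that no speaker contributes recordings to both the training and the test partition.

```python
import random
from collections import defaultdict


def speaker_independent_split(recordings, test_fraction=0.25, seed=0):
    """Split (speaker_id, recording_id) pairs so that train and test
    partitions share no speaker. Hypothetical helper for illustration."""
    # Group recordings by speaker so whole speakers are assigned at once.
    by_speaker = defaultdict(list)
    for speaker_id, recording_id in recordings:
        by_speaker[speaker_id].append(recording_id)

    # Shuffle the speakers (not the recordings) reproducibly.
    speakers = sorted(by_speaker)
    random.Random(seed).shuffle(speakers)

    # Hold out at least one speaker for testing.
    n_test = max(1, round(len(speakers) * test_fraction))
    held_out = set(speakers[:n_test])

    train = [(s, r) for s, r in recordings if s not in held_out]
    test = [(s, r) for s, r in recordings if s in held_out]
    return train, test


# Toy data (assumed): several speakers, some with more than one recording.
recordings = [
    ("spk1", "a"), ("spk1", "b"),
    ("spk2", "c"),
    ("spk3", "d"), ("spk3", "e"),
    ("spk4", "f"),
]
train, test = speaker_independent_split(recordings)
train_speakers = {s for s, _ in train}
test_speakers = {s for s, _ in test}
```

A speaker-dependent design would instead shuffle the individual recordings, so the same speaker could appear on both sides of the split; the abstract's conclusion is that this kind of prior knowledge about a speaker dramatically improves measured performance.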

Pages: 343-350

Number of pages: 8
