The SRI CLEO Speaker-State Corpus

Times Cited: 0
Authors
Kathol, Andreas [1 ]
Shriberg, Elizabeth [1 ]
de Zambotti, Massimiliano [1 ]
Affiliation
[1] SRI Int, 333 Ravenswood Ave, Menlo Pk, CA 94025 USA
Source
17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES | 2016
Funding
U.S. National Science Foundation;
Keywords
Speech corpora; psychophysiology; autonomic nervous system; speech features; emotion;
DOI
10.21437/Interspeech.2016-1141
Chinese Library Classification
O42 [Acoustics];
Subject Classification Codes
070206 ; 082403 ;
Abstract
We introduce the SRI CLEO (Conversational Language about Everyday Objects) Speaker-State Corpus of speech, video, and biosignals. The goal of the corpus is to provide insight into the speech and physiological changes resulting from subtle, context-based influences on affect and cognition. Speakers were prompted by collections of pictures of neutral everyday objects and were instructed to provide speech related to any subset of the objects for a preset period of time (120 or 180 seconds, depending on task). The corpus provides signals for 43 speakers under four different speaker-state conditions: (1) neutral and emotionally charged audiovisual background; (2) cognitive load; (3) time pressure; and (4) various acted emotions. Unlike previous studies that have linked speaker state to the content of the speaking task itself, the CLEO prompts remain largely pragmatically, semantically, and affectively neutral across all conditions. This framework enables more direct comparisons across both conditions and speakers. The corpus also includes more traditional speaker tasks involving reading and free-form reporting of neutral and emotionally charged content. The recorded biosignals include skin conductance, respiration, blood pressure, and ECG. The corpus is in the final stages of processing and will be made available to the research community.
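To make the corpus design described in the abstract concrete, the sketch below shows one possible way to model a single CLEO recording session in Python. The class, field names, and condition labels are hypothetical illustrations derived only from the abstract; they do not reflect the corpus's actual metadata format or file layout.

# Hypothetical sketch of CLEO session metadata; names are assumptions, not the released corpus schema.
from dataclasses import dataclass, field
from typing import List

# The four speaker-state conditions described in the abstract.
CONDITIONS = [
    "audiovisual_background",  # neutral vs. emotionally charged background
    "cognitive_load",
    "time_pressure",
    "acted_emotion",
]

# Biosignals recorded alongside speech and video.
BIOSIGNALS = ["skin_conductance", "respiration", "blood_pressure", "ecg"]

@dataclass
class CleoRecording:
    """One prompted speaking session for a single speaker (illustrative only)."""
    speaker_id: int              # one of the 43 speakers
    condition: str               # one of CONDITIONS
    duration_s: int              # 120 or 180, depending on task
    biosignals: List[str] = field(default_factory=lambda: list(BIOSIGNALS))

    def __post_init__(self) -> None:
        if self.condition not in CONDITIONS:
            raise ValueError(f"unknown condition: {self.condition}")
        if self.duration_s not in (120, 180):
            raise ValueError("prompted tasks run 120 or 180 seconds")

# Example usage: a cognitive-load session for a hypothetical speaker 7.
rec = CleoRecording(speaker_id=7, condition="cognitive_load", duration_s=120)
print(rec)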
Pages: 1541 - 1544
Page count: 4
Related Papers
50 records total
  • [31] Speaker Recognition Benchmark using the CHiME-5 Corpus
    Garcia-Romero, Daniel
    Snyder, David
    Watanabe, Shinji
    Sell, Gregory
    McCree, Alan
    Povey, Daniel
    Khudanpur, Sanjeev
    INTERSPEECH 2019, 2019, : 1506 - 1510
  • [32] New background speaker models and experiments on the ANDOSL speech corpus
    Tran, D
    Sharma, D
    KNOWLEDGE-BASED INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT 2, PROCEEDINGS, 2004, 3214 : 498 - 503
  • [33] Speaker recognition based on multilevel speech signal analysis on Polish corpus
    Drgas, Szymon
    Dabrowski, Adam
    MULTIMEDIA TOOLS AND APPLICATIONS, 2015, 74 : 4195 - 4211
  • [34] Speaker identification based on complete feature corpus and evaluation of mutual information
    Yu, Yibiao
    Wang, Shuozhong
    CHINESE JOURNAL OF ACOUSTICS, 2005, (03) : 280 - 288
  • [35] Tandem Features for Text-dependent Speaker Verification on the RedDots Corpus
    Alam, Md Jahangir
    Kenny, Patrick
    Gupta, Vishwa
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 420 - 424
  • [36] The effects of handset variability on speaker recognition performance: Experiments on the switchboard corpus
    Reynolds, DA
    1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996, : 113 - 116
  • [37] Speaker Verification Experiments using Identity Vectors, on a Romanian Speakers Corpus
    Novac, Oana-Mariana
    Toma, Stefan-Adrian
    Bureaca, Emil
    2021 INTERNATIONAL CONFERENCE ON SPEECH TECHNOLOGY AND HUMAN-COMPUTER DIALOGUE (SPED), 2021, : 162 - 166
  • [38] Comparative Study for Multi-Speaker Mongolian TTS with a New Corpus
    Liang, Kailin
    Liu, Bin
    Hu, Yifan
    Liu, Rui
    Bao, Feilong
    Gao, Guanglai
    APPLIED SCIENCES-BASEL, 2023, 13 (07):
  • [39] AISHELL-3: A Multi-Speaker Mandarin TTS Corpus
    Shi, Yao
    Bu, Hui
    Xu, Xin
    Zhang, Shaoji
    Li, Ming
    INTERSPEECH 2021, 2021, : 2756 - 2760
  • [40] Speaker Recognition Based on Multilevel Speech Signal Analysis on Polish Corpus
    Drgas, Szymon
    Dabrowski, Adam
    MULTIMEDIA COMMUNICATIONS, SERVICES AND SECURITY, 2012, 287 : 85 - 94