BUILDING A CHINESE NATURAL EMOTIONAL AUDIO-VISUAL DATABASE

Cited: 0
Authors
Bao, Wei [1 ,2 ]
Li, Ya [2 ]
Gu, Mingliang [1 ]
Yang, Minghao [2 ]
Li, Hao [2 ]
Chao, Linlin [2 ]
Tao, Jianhua [2 ]
Affiliations
[1] Jiangsu Normal Univ, Inst Linguist Sci, Xuzhou, Peoples R China
[2] Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
Keywords
Audio-visual database; spontaneous emotion; emotional corpus annotation
DOI: not available
CLC classification: TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline codes: 0808; 0809
Abstract
Building a spontaneous, multi-modal, richly annotated emotion database is a challenging task. Although a growing number of emotional corpora have become available, most were recorded in lab-controlled environments. This paper presents a recently collected database, the CASIA Natural Emotional Audio-Visual Database. The corpus contains two hours of spontaneous emotional segments extracted from 219 speakers in films, TV plays, and talk shows. The number of speakers makes this database a valuable addition to the existing emotional databases. In total, 24 non-prototypical emotional states were labeled by three native Chinese speakers. In contrast to other available emotional databases, we provide multi-emotion labels and fake/suppressed-emotion labels. To the best of our knowledge, this is the first large-scale Chinese natural emotion corpus addressing multi-modal, natural emotion.
Pages: 583-587 (5 pages)
Related Papers (50 total)
  • [21] An audio-visual database for evaluating person tracking algorithms
    Krinidis, M
    Stamou, G
    Teutsch, H
    Spors, S
    Nikolaidis, N
    Rabenstein, R
    Pitas, L
    2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 237 - 240
  • [22] A New Audio-Visual Database to Represent Urban Path
    Qing, Ji
    INFORMATION AND BUSINESS INTELLIGENCE, PT I, 2012, 267 : 713 - 719
  • [23] The Audio-Visual Arabic Dataset for Natural Emotions
    Abu Shaqra, Ftoon
    Duwairi, Rehab
    Al-Ayyoub, Mahmoud
    2019 7TH INTERNATIONAL CONFERENCE ON FUTURE INTERNET OF THINGS AND CLOUD (FICLOUD 2019), 2019, : 324 - 329
  • [24] MindData for Enhanced Entertainment: Building a Comprehensive EEG Dataset of Emotional Responses to Audio-Visual Stimuli
    Thejaswini, M. S.
    Kumar, G. Hemantha
    Aradhya, V. N. Manjunath
    Narendra, R.
    Suresha, M.
    Guru, D. S.
    APPLIED INTELLIGENCE AND INFORMATICS, AII 2023, 2024, 2065 : 82 - 94
  • [25] Audio-Visual Speech Synthesis Based on Chinese Visual Triphone
    Zhao, Hui
    Chen, Yue-bing
    Shen, Ya-min
    Tang, Chao-jing
    PROCEEDINGS OF THE 2009 2ND INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, VOLS 1-9, 2009, : 3316 - 3320
  • [26] Building a data corpus for audio-visual speech recognition
    Chitu, Alin G.
    Rothkrantz, Leon J. M.
    EUROMEDIA '2007, 2007, : 88 - 92
  • [27] An audio-visual distance for audio-visual speech vector quantization
    Girin, L
    Foucher, E
    Feng, G
    1998 IEEE SECOND WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING, 1998, : 523 - 528
  • [28] Catching audio-visual mice:: The extrapolation of audio-visual speed
    Hofbauer, MM
    Wuerger, SM
    Meyer, GF
    Röhrbein, F
    Schill, K
    Zetzsche, C
    PERCEPTION, 2003, 32 : 96 - 96
  • [30] Multiple Classifier Systems for the Classification of Audio-Visual Emotional States
    Glodek, Michael
    Tschechne, Stephan
    Layher, Georg
    Schels, Martin
    Brosch, Tobias
    Scherer, Stefan
    Kaechele, Markus
    Schmidt, Miriam
    Neumann, Heiko
    Palm, Guenther
    Schwenker, Friedhelm
    AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION, PT II, 2011, 6975 : 359 - 368