BUILDING A CHINESE NATURAL EMOTIONAL AUDIO-VISUAL DATABASE

Cited: 0
Authors
Bao, Wei [1 ,2 ]
Li, Ya [2 ]
Gu, Mingliang [1 ]
Yang, Minghao [2 ]
Li, Hao [2 ]
Chao, Linlin [2 ]
Tao, Jianhua [2 ]
Affiliations
[1] Jiangsu Normal Univ, Inst Linguist Sci, Xuzhou, Peoples R China
[2] Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
Keywords
Audio-visual database; spontaneous emotion; emotional corpus annotation
DOI: not available
CLC classification: TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline codes: 0808; 0809
Abstract
Building a spontaneous, multi-modal, richly annotated emotion database is a challenging task. Although a growing number of emotional corpora have become available, most were recorded in lab-controlled environments. This paper presents a recently collected database, the CASIA Natural Emotional Audio-Visual Database. The corpus contains two hours of spontaneous emotional segments extracted from 219 speakers in films, TV plays, and talk shows. The number of speakers makes this database a valuable addition to the existing emotional databases. In total, 24 non-prototypical emotional states were labeled by three native Chinese speakers. In contrast to other available emotional databases, we provide multi-emotion labels and fake/suppressed-emotion labels. To the best of our knowledge, this is the first large-scale Chinese natural emotion corpus addressing multi-modal, natural emotion.
Pages: 583-587 (5 pages)
Related Papers (50 total)
  • [21] An audio-visual database for evaluating person tracking algorithms
    Krinidis, M
    Stamou, G
    Teutsch, H
    Spors, S
    Nikolaidis, N
    Rabenstein, R
    Pitas, L
    2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 237 - 240
  • [22] A New Audio-Visual Database to Represent Urban Path
    Qing, Ji
    INFORMATION AND BUSINESS INTELLIGENCE, PT I, 2012, 267 : 713 - 719
  • [23] The Audio-Visual Arabic Dataset for Natural Emotions
    Abu Shaqra, Ftoon
    Duwairi, Rehab
    Al-Ayyoub, Mahmoud
    2019 7TH INTERNATIONAL CONFERENCE ON FUTURE INTERNET OF THINGS AND CLOUD (FICLOUD 2019), 2019, : 324 - 329
  • [24] MindData for Enhanced Entertainment: Building a Comprehensive EEG Dataset of Emotional Responses to Audio-Visual Stimuli
    Thejaswini, M. S.
    Kumar, G. Hemantha
    Aradhya, V. N. Manjunath
    Narendra, R.
    Suresha, M.
    Guru, D. S.
    APPLIED INTELLIGENCE AND INFORMATICS, AII 2023, 2024, 2065 : 82 - 94
  • [25] Audio-Visual Speech Synthesis Based on Chinese Visual Triphone
    Zhao, Hui
    Chen, Yue-bing
    Shen, Ya-min
    Tang, Chao-jing
    PROCEEDINGS OF THE 2009 2ND INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, VOLS 1-9, 2009, : 3316 - 3320
  • [26] Building a data corpus for audio-visual speech recognition
    Chitu, Alin G.
    Rothkrantz, Leon J. M.
    EUROMEDIA '2007, 2007, : 88 - 92
  • [27] An audio-visual distance for audio-visual speech vector quantization
    Girin, L
    Foucher, E
    Feng, G
    1998 IEEE SECOND WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING, 1998, : 523 - 528
  • [28] Catching audio-visual mice:: The extrapolation of audio-visual speed
    Hofbauer, MM
    Wuerger, SM
    Meyer, GF
    Röhrbein, F
    Schill, K
    Zetzsche, C
    PERCEPTION, 2003, 32 : 96 - 96
  • [30] Multiple Classifier Systems for the Classification of Audio-Visual Emotional States
    Glodek, Michael
    Tschechne, Stephan
    Layher, Georg
    Schels, Martin
    Brosch, Tobias
    Scherer, Stefan
    Kaechele, Markus
    Schmidt, Miriam
    Neumann, Heiko
    Palm, Guenther
    Schwenker, Friedhelm
    AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION, PT II, 2011, 6975 : 359 - 368