A Large Scale Speech Sentiment Corpus

Cited by: 0

Authors:

Chen, Eric Y. ^{[1]}

Lu, Zhiyun ^{[2]}

Xu, Hao ^{[1]}

Cao, Liangliang ^{[1]}

Zhang, Yu ^{[1]}

Fan, James ^{[1]}

Affiliations:

[1] Google Inc, New York, NY 10011 USA

[2] Univ Southern Calif, Los Angeles, CA 90007 USA

Source:

PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020) | 2020

Keywords:

sentiment; switchboard; multimodal; speech;

DOI:

Not available

Chinese Library Classification number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

We present a multimodal corpus for sentiment analysis based on the existing Switchboard-1 Telephone Speech Corpus released by the Linguistic Data Consortium. This corpus extends the Switchboard-1 Telephone Speech Corpus by adding sentiment labels from 3 different human annotators for every transcript segment. Each sentiment label can be one of three options: positive, negative, or neutral. Annotators were recruited using Google Cloud's data labeling service, and the labeling task was conducted over the internet. The corpus contains a total of 49,500 labeled utterances covering 140 hours of audio. To the best of our knowledge, this is the largest multimodal corpus for sentiment analysis that includes both speech and text features.

Pages: 6549 - 6555

Number of pages: 7

50 items in total

[41] SloParl - Slovenian Parliamentary speech and text corpus for large vocabulary continuous speech recognition
Zgank, Andrej
Rotovnik, Tomaz
Grasic, Matej
Kos, Marko
Vlaj, Damjan
Kacic, Zdravko
INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 197 - 200
[42] AHUMADA: A large speech corpus in Spanish for speaker identification and verification
Ortega-Garcia, J
Gonzalez-Rodriguez, J
Marrero-Aguiar, V
Diaz-Gomez, JJ
Garcia-Jimenez, R
Lucena-Molina, J
Sanchez-Molero, JAG
PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-6, 1998, : 773 - 776
[43] AHUMADA: A large speech corpus in Spanish for speaker characterization and identification
Ortega-Garcia, J
Gonzalez-Rodriguez, J
Marrero-Aguiar, V
SPEECH COMMUNICATION, 2000, 31 (2-3) : 255 - 264
[44] ParCzech 3.0: A Large Czech Speech Corpus with Rich Metadata
Kopp, Matyas
Stankov, Vladislav
Kruza, Jan Oldrich
Stranak, Pavel
Bojar, Ondrej
TEXT, SPEECH, AND DIALOGUE, TSD 2021, 2021, 12848 : 293 - 304
[45] Pitch distributions in a very large corpus of spontaneous Finnish speech
Lennes, Mietta
Toivola, Minnaleena
INTERSPEECH 2023, 2023, : 4778 - 4782
[46] The Impact of Arabic Part of Speech Tagging on Sentiment Analysis: A New Corpus and Deep Learning Approach
Nerabie, Abdul Munem
AlKhatib, Manar
Mathew, Sujith Samuel
El Barachi, May
Oroumchian, Farhad
12TH INTERNATIONAL CONFERENCE ON AMBIENT SYSTEMS, NETWORKS AND TECHNOLOGIES (ANT) / THE 4TH INTERNATIONAL CONFERENCE ON EMERGING DATA AND INDUSTRY 4.0 (EDI40) / AFFILIATED WORKSHOPS, 2021, 184 : 148 - 155
[47] Low-resource cross-domain product review sentiment classification based on a CNN with an auxiliary large-scale corpus
Wei X.
Lin H.
Yu Y.
Yang L.
Wei, Xiaocong (weixiaocong@dlufl.edu.cn), 1600, MDPI AG (10):
[48] A sentiment corpus for the cryptocurrency financial domain: the CryptoLin corpus
Gadi, Manoel Fernando Alonso
Sicilia, Miguel Angel
LANGUAGE RESOURCES AND EVALUATION, 2024,
[49] BeSt: The Belief and Sentiment Corpus
Tracey, Jennifer
Rambow, Owen
Arrigo, Michael
Cardie, Claire
Dalton, Adam
Dang, Hoa
Diab, Mona
Dorr, Bonnie
Guthrie, Louise
Markowska, Magdalena
Muresan, Smaranda
Prabhakaran, Vinodkumar
Shaikh, Samira
Strzalkowski, Tomek
Wiebe, Janyce
LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 2460 - 2467
[50] A Large Scale Test Corpus for Semantic Table Search
Leventidis, Aristotelis
Christensen, Martin Pekar
Lissandrini, Matteo
Di Rocco, Laura
Hose, Katja
Miller, Renee J.
PROCEEDINGS OF THE 47TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2024, 2024, : 1142 - 1151