A Large Scale Speech Sentiment Corpus

Authors
Chen, Eric Y. [1]
Lu, Zhiyun [2]
Xu, Hao [1]
Cao, Liangliang [1]
Zhang, Yu [1]
Fan, James [1]
Affiliations
[1] Google Inc, New York, NY 10011 USA
[2] Univ Southern Calif, Los Angeles, CA 90007 USA
Keywords
sentiment; switchboard; multimodal; speech
DOI
Not available
Chinese Library Classification
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
We present a multimodal corpus for sentiment analysis built on the Switchboard-1 Telephone Speech Corpus released by the Linguistic Data Consortium. The corpus extends Switchboard-1 by adding sentiment labels from three different human annotators for every transcript segment. Each label takes one of three values: positive, negative, or neutral. Annotators were recruited through Google Cloud's data labeling service, and the labeling task was conducted over the internet. The corpus contains a total of 49,500 labeled utterances covering 140 hours of audio. To the best of our knowledge, this is the largest multimodal corpus for sentiment analysis that includes both speech and text features.
Pages: 6549-6555
Page count: 7