A Large Scale Speech Sentiment Corpus

Authors
Chen, Eric Y. [1]
Lu, Zhiyun [2]
Xu, Hao [1]
Cao, Liangliang [1]
Zhang, Yu [1]
Fan, James [1]
Affiliations
[1] Google Inc, New York, NY 10011 USA
[2] University of Southern California, Los Angeles, CA 90007 USA
Keywords
sentiment; switchboard; multimodal; speech
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
We present a multimodal corpus for sentiment analysis based on the existing Switchboard-1 Telephone Speech Corpus released by the Linguistic Data Consortium. The corpus extends Switchboard-1 by adding sentiment labels from three different human annotators for every transcript segment. Each sentiment label is one of three options: positive, negative, or neutral. Annotators were recruited through Google Cloud's data labeling service, and the labeling task was conducted over the internet. The corpus contains a total of 49,500 labeled utterances covering 140 hours of audio. To the best of our knowledge, this is the largest multimodal corpus for sentiment analysis that includes both speech and text features.
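The abstract says each transcript segment carries three annotator labels but does not say how they are combined into a single label. The sketch below shows one common option, a majority vote, purely as an illustration. The function name `majority_label`, the lowercase label strings, and the choice to return `None` on a tie are assumptions, not the authors' method or the corpus's actual file format.

```python
from collections import Counter
from typing import Optional, Sequence

# Assumed label vocabulary, taken from the three options named in the abstract.
LABELS = ("positive", "negative", "neutral")


def majority_label(annotations: Sequence[str]) -> Optional[str]:
    """Return the label chosen by most annotators, or None when there is no clear winner.

    Hypothetical helper: the corpus itself may resolve disagreement differently.
    """
    if not annotations:
        return None
    for label in annotations:
        if label not in LABELS:
            raise ValueError(f"unknown sentiment label: {label!r}")
    # most_common() sorts labels by descending count.
    ranked = Counter(annotations).most_common()
    # A tie for first place (e.g. three different labels) has no majority.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


if __name__ == "__main__":
    print(majority_label(["positive", "positive", "neutral"]))
    print(majority_label(["positive", "negative", "neutral"]))
```

With three annotators and three labels, a segment either has a majority of at least two matching labels or a three-way split. Returning `None` for the split keeps such segments easy to filter out or send for re-annotation.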
Pages: 6549-6555
Page count: 7