Emotional Speech Recognition of Holocaust Survivors with Deep Neural Network Models for Russian Language

Cited by: 0
Authors
Bukreeva, Liudmila [1]
Guseva, Daria [1]
Dolgushin, Mikhail [1]
Evdokimova, Vera [1]
Obotnina, Vasilisa [1]
Affiliations
[1] St Petersburg State Univ, Universitetskaya Emb 7-9, St Petersburg 199034, Russia
Source
Keywords
Question Answering; Corpora; Visual History Archives;
DOI
10.1007/978-3-031-48309-7_6
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Recognition of highly emotional speech remains a challenging case for automatic speech recognition. This article reports experiments on recognizing highly emotional speech, using oral history archives provided by the Yad Vashem foundation. The material consists of elderly people's emotional speech, marked by accents and everyday language. We analyze and preprocess 26 h of publicly available video interviews with Holocaust survivors. Our objective is to develop a system that performs emotional speech recognition based on deep neural network models. We present and evaluate the results, which contribute to research on oral history archives.
Pages: 68-76
Number of pages: 9