A COMPLETE END-TO-END SPEAKER VERIFICATION SYSTEM USING DEEP NEURAL NETWORKS: FROM RAW SIGNALS TO VERIFICATION RESULT

Cited by: 0
Authors
Jung, Jee-Weon [1]
Heo, Hee-Soo [1]
Yang, Il-Ho [1]
Shim, Hye-Jin [1]
Yu, Ha-Jin [1]
Affiliations
[1] Univ Seoul, Sch Comp Sci, Seoul, South Korea
Keywords
speaker verification; end-to-end system; raw audio signal
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
End-to-end systems using deep neural networks have been widely studied in the field of speaker verification. Raw audio signal processing has also been widely studied in the fields of automatic music tagging and speech recognition. However, as far as we know, end-to-end systems using raw audio signals have not been explored in speaker verification. In this paper, a complete end-to-end speaker verification system is proposed, which takes raw audio signals as input and outputs the verification results. A pre-processing layer and the embedded speaker feature extraction models were mainly investigated. The proposed pre-emphasis layer was combined with a strided convolution layer for pre-processing at the first two hidden layers. In addition, speaker feature extraction models using convolutional layers and long short-term memory are proposed to be embedded in the proposed end-to-end system.
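The pre-processing idea mentioned in the abstract (pre-emphasis followed by a strided convolution over the raw waveform) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no layer parameters, so the fixed coefficient `alpha=0.97`, the kernel length of 400 samples, the stride of 160 samples, the four random kernels, and the function names `pre_emphasis` and `strided_conv1d` are all assumptions chosen for illustration. In a trained network the kernels would be learned rather than random.

```python
import numpy as np

def pre_emphasis(signal, alpha=0.97):
    """Classic pre-emphasis filter: y[t] = x[t] - alpha * x[t-1], y[0] = x[0].
    alpha is an assumed, conventional value, not taken from the paper."""
    signal = np.asarray(signal, dtype=np.float64)
    out = signal.copy()
    out[1:] -= alpha * signal[:-1]
    return out

def strided_conv1d(signal, kernels, stride):
    """Valid (no padding) 1-D cross-correlation with several kernels.
    kernels has shape (n_filters, kernel_len); the result has shape
    (n_filters, n_frames)."""
    signal = np.asarray(signal, dtype=np.float64)
    kernels = np.asarray(kernels, dtype=np.float64)
    kernel_len = kernels.shape[1]
    n_frames = (len(signal) - kernel_len) // stride + 1
    starts = np.arange(n_frames) * stride
    # Gather overlapping windows: shape (n_frames, kernel_len)
    frames = signal[starts[:, None] + np.arange(kernel_len)]
    return kernels @ frames.T

# Hypothetical input: one second of noise standing in for 16 kHz audio.
rng = np.random.default_rng(0)
wave = rng.standard_normal(16000)
kernels = rng.standard_normal((4, 400)) * 0.01  # stand-in for learned filters
features = strided_conv1d(pre_emphasis(wave), kernels, stride=160)
print(features.shape)  # (4, 98)
```

With these assumed settings, each output frame summarizes a 25 ms window with a 10 ms hop, which mirrors the framing of conventional acoustic features while leaving the filters free to be learned.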
Pages: 5349-5353
Page count: 5
Related papers
50 in total
  • [1] DEEP NEURAL NETWORK-BASED SPEAKER EMBEDDINGS FOR END-TO-END SPEAKER VERIFICATION
    Snyder, David
    Ghahremani, Pegah
    Povey, Daniel
    Garcia-Romero, Daniel
    Carmiel, Yishay
    Khudanpur, Sanjeev
    2016 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2016), 2016, : 165 - 170
  • [2] RawNet: Advanced end-to-end deep neural network using raw waveforms for text-independent speaker verification
    Jung, Jee-weon
    Heo, Hee-Soo
    Kim, Ju-ho
    Shim, Hye-jin
    Yu, Ha-Jin
    INTERSPEECH 2019, 2019, : 1268 - 1272
  • [3] An End-to-End System for Unconstrained Face Verification with Deep Convolutional Neural Networks
    Chen, Jun-Cheng
    Ranjan, Rajeev
    Kumar, Amit
    Chen, Ching-Hui
    Patel, Vishal M.
    Chellappa, Rama
    2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOP (ICCVW), 2015, : 360 - 368
  • [4] Neural PLDA Modeling for End-to-End Speaker Verification
    Ramoji, Shreyas
    Krishnan, Prashant
    Ganapathy, Sriram
    INTERSPEECH 2020, 2020, : 4333 - 4337
  • [5] Shortcut Connections based Deep Speaker Embeddings for End-to-End Speaker Verification System
    Seo, Soonshin
    Rim, Daniel Jun
    Lim, Minkyu
    Lee, Donghyun
    Park, Hosung
    Oh, Junseok
    Kim, Changmin
    Kim, Ji-Hwan
    INTERSPEECH 2019, 2019, : 2928 - 2932
  • [6] Robust End-to-End Speaker Verification Using EEG
    Han, Yan
    Krishna, Gautam
    Tran, Co
    Carnahan, Mason
    Tewfik, Ahmed H.
    28TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2020), 2021, : 1170 - 1174
  • [7] End-To-End Phonetic Neural Network Approach for Speaker Verification
    Demirbag, Sedat
    Erden, Mustafa
    Arslan, Levent
    2020 28TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2020,
  • [8] Improved Relation Networks for End-to-End Speaker Verification and Identification
    Chaubey, Ashutosh
    Sinha, Sparsh
    Ghose, Susmita
    INTERSPEECH 2022, 2022, : 5085 - 5089
  • [9] Analysis of Length Normalization in End-to-End Speaker Verification System
    Cai, Weicheng
    Chen, Jinkun
    Li, Ming
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3618 - 3622
  • [10] Investigating Raw Wave Deep Neural Networks for End-to-End Speaker Spoofing Detection
    Dinkel, Heinrich
    Qian, Yanmin
    Yu, Kai
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (11) : 2002 - 2014