EXPLORING PRACTICAL ASPECTS OF NEURAL MASK-BASED BEAMFORMING FOR FAR-FIELD SPEECH RECOGNITION

Cited by: 0
Authors
Boeddeker, Christoph [1 ,2 ]
Erdogan, Hakan [1 ]
Yoshioka, Takuya [1 ]
Haeb-Umbach, Reinhold [2 ]
Affiliations
[1] Microsoft AI & Res, Redmond, WA 98052 USA
[2] Paderborn Univ, Dept Commun Engn, Paderborn, Germany
Source
2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2018
Keywords
Far-field speech recognition; acoustic beamforming; neural networks; time-frequency masks; online processing;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
This work examines acoustic beamformers employing neural networks (NNs) for mask prediction as a front-end for automatic speech recognition (ASR) systems in practical scenarios such as voice-enabled home devices. To test the versatility of the mask-predicting network, the system is evaluated with different recording hardware, different microphone array designs, and different acoustic models of the downstream ASR system. Significant gains in recognition accuracy are obtained in all configurations even though the NN had been trained on mismatched data. Unlike previous work, the NN is trained on a feature-level objective, which gives some performance advantage over a mask-related criterion. Furthermore, different approaches for realizing online, or adaptive, NN-based beamforming are explored; the online algorithms still show significant gains over the baseline performance.
Pages: 6697-6701
Page count: 5