A New Neural Beamformer for Multi-channel Speech Separation

Cited by: 0
Authors
Ruqiao Liu
Yi Zhou
Hongqing Liu
Xinmeng Xu
Jie Jia
Binbin Chen
Affiliations
[1] Chongqing University of Posts and Telecommunications, School of Communication and Information Engineering
[2] Vivo AI Lab, E.E. Engineering
[3] Trinity College Dublin
Source
Keywords
Multi-channel speech separation; Neural beamformer; Microphone array; Attention mechanism
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
Speech separation is key to many back-end speech tasks, such as multi-speaker speech recognition. In recent years, aided by advances in deep learning, many single-channel speech separation models have performed well in weakly reverberant environments. When reverberation is present, however, multi-channel speech separation models still hold an advantage. Among them, deep neural network (DNN) based beamformers (also known as neural beamformers) have achieved significant improvements in separation quality. Current neural beamformers cannot jointly optimize the beamforming layers and the DNN layers when they use prior knowledge from existing beamforming algorithms, which may prevent the model from reaching optimal separation performance. To solve this problem, this paper employs a set of beamformers that uniformly sample the space as a learnable module in the neural network, with the initial values of their coefficients determined by the existing maximum directivity factor (DF) beamformer. Furthermore, to obtain beam representations of the source signals when their directions are unknown, a cross-attention mechanism is introduced. Experimental results show that, in the separation task with reverberation, the proposed method outperforms the current state-of-the-art temporal neural beamformer, the filter-and-sum network (FasNet), and several mainstream multi-channel speech separation approaches in terms of scale-invariant signal-to-noise ratio (SI-SNR), perceptual evaluation of speech quality (PESQ), and short-time objective intelligibility (STOI).
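The abstract reports results in SI-SNR without defining it. As a minimal sketch, assuming the commonly used zero-mean definition (the paper's own evaluation code is not given in this record), the metric can be computed as below. The function name `si_snr` and the `eps` stabilizer are illustrative choices, not taken from the paper.

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length 1-D signals.

    Assumed definition: remove the mean of both signals, project the
    estimate onto the target, and compare the energy of the projection
    with the energy of the residual.
    """
    if len(estimate) != len(target):
        raise ValueError("estimate and target must have the same length")
    n = len(target)

    # Remove the DC offset from both signals.
    mean_est = sum(estimate) / n
    mean_tgt = sum(target) / n
    est = [x - mean_est for x in estimate]
    tgt = [x - mean_tgt for x in target]

    # Project the estimate onto the target: s_target = (<est, tgt> / ||tgt||^2) * tgt
    dot = sum(e * t for e, t in zip(est, tgt))
    tgt_energy = sum(t * t for t in tgt) + eps
    scale = dot / tgt_energy
    s_target = [scale * t for t in tgt]

    # Everything in the estimate that is not explained by the target.
    e_noise = [e - s for e, s in zip(est, s_target)]

    target_energy = sum(s * s for s in s_target)
    noise_energy = sum(x * x for x in e_noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))
```

Because the estimate is projected onto the target before the energies are compared, rescaling the estimate leaves the score essentially unchanged, which is why the measure is called scale-invariant.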
Pages: 977-987
Page count: 10
Related papers
50 in total
  • [1] A New Neural Beamformer for Multi-channel Speech Separation
    Liu, Ruqiao
    Zhou, Yi
    Liu, Hongqing
    Xu, Xinmeng
    Jia, Jie
    Chen, Binbin
    JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2022, 94 (10): 977 - 987
  • [2] A Pre-Separation and All-Neural Beamformer Framework for Multi-Channel Speech Separation
    Xie, Wupeng
    Xiang, Xiaoxiao
    Zhang, Xiaojuan
    Liu, Guanghong
    SYMMETRY-BASEL, 2023, 15 (02)
  • [3] DFBNet: Deep Neural Network based Fixed Beamformer for Multi-channel Speech Separation
    Liu, Ruqiao
    Zhou, Yi
    Liu, Hongqing
    Xu, Xinmeng
    Jia, Jie
    Chen, Binbin
    2021 IEEE WORKSHOP ON SIGNAL PROCESSING SYSTEMS (SIPS 2021), 2021: 194 - 198
  • [4] Three-stage hybrid neural beamformer for multi-channel speech enhancement
    Kuang, Kelan
    Yang, Feiran
    Li, Junfeng
    Yang, Jun
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2023, 153 (06): 3378 - 3389
  • [5] Locate and Beamform: Two-dimensional Locating All-neural Beamformer for Multi-channel Speech Separation
    Fu, Yanjie
    Ge, Meng
    Wang, Honglong
    Li, Nan
    Yin, Haoran
    Wang, Longbiao
    Zhang, Gaoyan
    Dang, Jianwu
    Deng, Chengyun
    Wang, Fei
    INTERSPEECH 2023, 2023: 3789 - 3793
  • [6] Multi-channel Speech Enhancement Based on the MVDR Beamformer and Postfilter
    Wang, Dujuan
    Bao, Changchun
    2020 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, COMMUNICATIONS AND COMPUTING (IEEE ICSPCC 2020), 2020
  • [7] Iteratively Refined Multi-Channel Speech Separation
    Zhang, Xu
    Bao, Changchun
    Yang, Xue
    Zhou, Jing
    APPLIED SCIENCES-BASEL, 2024, 14 (14)
  • [8] Multi-Modal Multi-Channel Target Speech Separation
    Gu, Rongzhi
    Zhang, Shi-Xiong
    Xu, Yong
    Chen, Lianwu
    Zou, Yuexian
    Yu, Dong
    IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2020, 14 (03): 530 - 541
  • [9] Multi-Channel Speech Enhancement Using Labelled Random Finite Sets and a Neural Beamformer in Cocktail Party Scenario
    Datta, Jayanta
    Firoozabadi, Ali Dehghan
    Zabala-Blanco, David
    Castillo-Soria, Francisco R.
    APPLIED SCIENCES-BASEL, 2025, 15 (06)
  • [10] Multi-channel separation of dynamic speech and sound events
    Fujimura, Takuya
    Scheibler, Robin
    INTERSPEECH 2023, 2023: 3749 - 3753