Microphone Array Speech Enhancement Via Beamforming Based Deep Learning Network

Authors
Pathrose, Jeyasingh [1]
Ismail, M. Mohamed [2]
Mohan, P. Madhan [3]
Affiliations
[1] BS Abdur Rahman Crescent Inst Sci & Technol, Dept Elect & Commun Engn, Chennai 600048, India
[2] BS Abdur Rahman Crescent Inst Sci & Technol, Chennai 600048, India
[3] Jasmin Infotech Pvt Ltd, Chennai 600100, India
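Keywords
Speech Enhancement; Microphone; Deep Learning; Beamforming; Noise Reduction
DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract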
In-car speech enhancement is an application of microphone array speech enhancement to a particular acoustic environment. Speech enhancement inside moving cars remains an active research topic, with researchers developing modules to improve speech quality and intelligibility in cars. Passenger dialogue, the sound of other equipment, and a wide range of interference effects are the major challenges for speech separation in the in-car environment. To address these challenges, a novel Beamforming based Deep learning Network (Bf-DLN) is proposed for speech enhancement. First, the captured microphone array signals are pre-processed using an adaptive beamforming technique, Linearly Constrained Minimum Variance (LCMV). Next, the pre-processed time-domain speech signals are converted into images using a time-frequency representation, the smoothed pseudo-Wigner-Ville distribution (SPWVD). A convolutional deep belief network (CDBN) extracts the most pertinent features from these images. An Enhanced Elephant Herd Algorithm (EEHA) then selects the desired source by eliminating the interference source. The experimental results demonstrate the effectiveness of the proposed strategy in removing background noise from the original speech signal. The proposed strategy outperforms existing methods in terms of PESQ, STOI, SSNRI, and SNR. The proposed Bf-DLN reaches a maximum PESQ of 1.98, compared with 1.82 for the two-stage Bi-LSTM, 1.75 for DNN-C, and 1.68 for GCN. The reported PESQ improvements over the existing GCN, DNN-C, and Bi-LSTM techniques are 1.75%, 3.15%, and 4.22%, respectively.
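The abstract does not give the beamformer's equations, so the following is only a minimal sketch of the textbook narrowband LCMV solution, w = R^{-1} C (C^H R^{-1} C)^{-1} f, and not the authors' implementation. The uniform linear array geometry, the 5 cm spacing, the 1 kHz frequency bin, the source angles, the diagonal loading, and the function names `lcmv_weights` and `steering_vector` are all assumptions chosen for illustration.

```python
import numpy as np


def steering_vector(n_mics, spacing_m, angle_deg, freq_hz, c=343.0):
    """Far-field steering vector for a uniform linear array (assumed geometry)."""
    positions = np.arange(n_mics) * spacing_m
    delays = positions * np.sin(np.deg2rad(angle_deg)) / c
    return np.exp(-2j * np.pi * freq_hz * delays)


def lcmv_weights(R, C, f, diag_load=1e-6):
    """Textbook LCMV weights: minimize w^H R w subject to C^H w = f.

    R: (M, M) noise-plus-interference covariance for one frequency bin
    C: (M, K) constraint matrix (one steering vector per column)
    f: (K,)   desired response for each constraint
    """
    n_mics = R.shape[0]
    # Small diagonal loading keeps the solve numerically stable (assumption).
    R_loaded = R + diag_load * np.trace(R).real / n_mics * np.eye(n_mics)
    Rinv_C = np.linalg.solve(R_loaded, C)   # R^{-1} C
    gram = C.conj().T @ Rinv_C              # C^H R^{-1} C
    return Rinv_C @ np.linalg.solve(gram, f)


# Hypothetical scene: talker at 0 degrees, interferer at 40 degrees.
n_mics, spacing, freq = 4, 0.05, 1000.0
d_target = steering_vector(n_mics, spacing, 0.0, freq)
d_interf = steering_vector(n_mics, spacing, 40.0, freq)

# Covariance of a strong interferer plus spatially white sensor noise.
R = 10.0 * np.outer(d_interf, d_interf.conj()) + 0.1 * np.eye(n_mics)

# Pass the target with unit gain and place a null on the interferer.
C = np.column_stack([d_target, d_interf])
f = np.array([1.0, 0.0], dtype=complex)

w = lcmv_weights(R, C, f)
response = C.conj().T @ w  # array response in the two constrained directions
print(np.round(np.abs(response), 6))
```

In a full system the weights would be computed per frequency bin from estimated covariances and applied to the short-time spectra of the array signals; the enhanced output would then feed the time-frequency imaging stage described above.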
Pages: 781-790
Page count: 10