Full single-type deep learning models with multihead attention for speech enhancement

Cited by: 2
Authors
Zacarias-Morales, Noel [1 ]
Hernandez-Nolasco, Jose Adan [1 ]
Pancardo, Pablo [1 ]
Affiliations
[1] Juarez Autonomous Univ Tabasco, Acad Div Sci & Informat Technol, Cunduacan 86690, Tabasco, Mexico
Keywords
Artificial neural network; Attention; Deep learning models; Speech enhancement; SELF-ATTENTION; ALGORITHM; NOISE;
DOI
10.1007/s10489-023-04571-y
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Artificial neural network (ANN) models with attention mechanisms for removing noise from audio signals, known as speech enhancement models, have proven effective. However, their architectures become complex, deep, and computationally demanding as they pursue higher efficiency. Given this situation, we selected simple, less resource-demanding models and evaluated them under the same training parameters and performance metrics to ensure a fair comparison among the four selected models. Our purpose was to demonstrate that simple neural network models with multihead attention are efficient when implemented on devices with conventional computational resources, since they provide results competitive with those of hybrid, complex, and resource-demanding models. We experimentally evaluated multilayer perceptron (MLP), one-dimensional and two-dimensional convolutional neural network (CNN), and gated recurrent unit (GRU) deep learning models, each with and without multihead attention, and analyzed the generalization capability of each model. The results showed that although these architectures comprised only one type of ANN, multihead attention increased the efficiency of the speech enhancement process, yielding results competitive with those of complex models. This study therefore serves as a reference for building simple and efficient single-type ANN models with attention.
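The abstract's "single-type model with multihead attention" idea can be illustrated with a minimal sketch: one recurrent-only encoder followed by multihead self-attention that estimates a per-frame magnitude mask. This is an illustrative assumption, not the authors' exact architecture; the layer sizes, the GRU choice, and the masking head are hypothetical.

```python
import torch
import torch.nn as nn

class GRUAttentionEnhancer(nn.Module):
    """Hedged sketch of a single-type speech enhancement model:
    a GRU encoder plus multihead self-attention estimating a
    sigmoid magnitude mask. Sizes are illustrative assumptions."""

    def __init__(self, n_freq=257, hidden=128, heads=4):
        super().__init__()
        # Single ANN type (GRU) as the only encoder, per the paper's premise.
        self.gru = nn.GRU(n_freq, hidden, batch_first=True)
        self.attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.mask = nn.Sequential(nn.Linear(hidden, n_freq), nn.Sigmoid())

    def forward(self, noisy_mag):            # (batch, frames, n_freq)
        h, _ = self.gru(noisy_mag)           # temporal encoding
        a, _ = self.attn(h, h, h)            # multihead self-attention
        return noisy_mag * self.mask(a)      # masked (enhanced) magnitude

model = GRUAttentionEnhancer()
x = torch.rand(2, 100, 257)                  # toy batch of noisy spectra
y = model(x)
print(y.shape)                               # same shape as the input
```

Because the mask is sigmoid-bounded in [0, 1], the output magnitude never exceeds the noisy input, which is a common design choice for mask-based enhancement.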
Pages: 20561 - 20576
Page count: 16
Related Papers
50 records
  • [1] Full single-type deep learning models with multihead attention for speech enhancement
    Noel Zacarias-Morales
    José Adán Hernández-Nolasco
    Pablo Pancardo
    Applied Intelligence, 2023, 53 : 20561 - 20576
  • [2] Medical Text Classification Using Hybrid Deep Learning Models with Multihead Attention
    Prabhakar, Sunil Kumar
    Won, Dong-Ok
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2021, 2021
  • [3] Deep Learning Models for Single-Channel Speech Enhancement on Drones
    Mukhutdinov, Dmitrii
    Alex, Ashish
    Cavallaro, Andrea
    Wang, Lin
    IEEE ACCESS, 2023, 11 : 22993 - 23007
  • [4] Explaining deep learning models for speech enhancement
    Sivasankaran, Sunit
    Vincent, Emmanuel
    Fohr, Dominique
    INTERSPEECH 2021, 2021, : 696 - 700
  • [5] Binaural speech enhancement algorithm based on attention and deep learning
    Li R.
    Li Q.
    Zhao F.
    Liu S.
    Huazhong Keji Daxue Xuebao (Ziran Kexue Ban)/Journal of Huazhong University of Science and Technology (Natural Science Edition), 2023, 51 (09): : 125 - 131, 166
  • [6] Deep Time Delay Neural Network for Speech Enhancement with Full Data Learning
    Fan, Cunhang
    Liu, Bin
    Tao, Jianhua
    Yi, Jiangyan
    Wen, Zhengqi
    Song, Leichao
    2021 12TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2021,
  • [7] Speech enhancement with noise estimation and filtration using deep learning models
    Kantamaneni, Sravanthi
    Charles, A.
    Babu, T. Ranga
    THEORETICAL COMPUTER SCIENCE, 2023, 941 : 14 - 28
  • [8] Voice disorder classification using speech enhancement and deep learning models
    Chaiani, Mounira
    Selouani, Sid Ahmed
    Boudraa, Malika
    Yakoub, Mohammed Sidi
    BIOCYBERNETICS AND BIOMEDICAL ENGINEERING, 2022, 42 (02) : 463 - 480
  • [9] Local spectral attention for full-band speech enhancement
    Hou, Zhongshu
    Hu, Qinwen
    Chen, Kai
    Cao, Zhanzhong
    Lu, Jing
    JASA EXPRESS LETTERS, 2023, 3 (11):