Speech Enhancement Using Dynamical Variational AutoEncoder

Cited by: 0

Authors:

Do, Hao D. [1]

Affiliations:

[1] FPT Univ, Ho Chi Minh City, Vietnam

Source:

INTELLIGENT INFORMATION AND DATABASE SYSTEMS, ACIIDS 2023, PT II | 2023 / Vol. 13996

Keywords:

speech enhancement; dynamical variational autoEncoder; generative model;

DOI:

10.1007/978-981-99-5837-5_21

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This research addresses speech enhancement with a generative model. Many existing solutions are trained on a fixed set of interference or noise types and struggle to extract speech from a mixture that contains an unfamiliar noise. We use a class of generative models called Dynamical Variational AutoEncoders (DVAE), which combine generative and temporal modeling to analyze the speech signal. Models in this class focus on the behavior of the speech signal, then extract and enhance the speech. Moreover, we design a new architecture in the DVAE class, named Bi-RVAE, which is simpler than the other models yet achieves good results. Experimental results show that the DVAE class, including our proposed design, recovers high-quality speech. This class could be used to enhance the speech signal before passing it to the central processing models.


Pages: 247-258

Page count: 12

50 records in total

[41] Multi-channel Speech Enhancement Using Time-Domain Convolutional Denoising Autoencoder
Tawara, Naohiro
Kobayashi, Tetsunori
Ogawa, Tetsuji
INTERSPEECH 2019, 2019, : 86 - 90
[42] Insider Threat Detection using Deep Autoencoder and Variational Autoencoder Neural Networks
Pantelidis, Efthimios
Bendiab, Gueltoum
Shiaeles, Stavros
Kolokotronis, Nicholas
PROCEEDINGS OF THE 2021 IEEE INTERNATIONAL CONFERENCE ON CYBER SECURITY AND RESILIENCE (IEEE CSR), 2021, : 129 - 134
[43] RETRACTED: Modeling neural dynamics during speech production using a state space variational autoencoder (Retracted Article)
Sun, Pengfei
Moses, David A.
Chang, Edward F.
2019 9TH INTERNATIONAL IEEE/EMBS CONFERENCE ON NEURAL ENGINEERING (NER), 2019, : 428 - 432
[44] Multichannel Variational Autoencoder-Based Speech Separation in Designated Speaker Order
Liao, Lele
Cheng, Guoliang
Ruan, Haoxin
Chen, Kai
Lu, Jing
SYMMETRY-BASEL, 2022, 14 (12):
[45] Unsupervised speech enhancement with deep dynamical generative speech and noise models
Lin, Xiaoyu
Leglaive, Simon
Girin, Laurent
Alameda-Pineda, Xavier
INTERSPEECH 2023, 2023, : 5102 - 5106
[46] A Benchmark of Dynamical Variational Autoencoders applied to Speech Spectrogram Modeling
Bie, Xiaoyu
Girin, Laurent
Leglaive, Simon
Hueber, Thomas
Alameda-Pineda, Xavier
INTERSPEECH 2021, 2021, : 46 - 50
[47] ROBUST UNSUPERVISED AUDIO-VISUAL SPEECH ENHANCEMENT USING A MIXTURE OF VARIATIONAL AUTOENCODERS
Sadeghi, Mostafa
Alameda-Pineda, Xavier
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 7534 - 7538
[48] Audio-Visual Speech Enhancement Using Conditional Variational Auto-Encoders
Sadeghi, Mostafa
Leglaive, Simon
Alameda-Pineda, Xavier
Girin, Laurent
Horaud, Radu
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 1788 - 1800
[49] Dimension Reduction on Open Data using Variational Autoencoder
Lee, Hyunmin
Wu, Zhen Hao
Zhang, Zhaolei
2018 18TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW), 2018, : 1080 - 1085
[50] FVAE: a regularized variational autoencoder using the Fisher criterion
Lai, Jie
Wang, Xiaodan
Xiang, Qian
Li, Rui
Song, Yafei
APPLIED INTELLIGENCE, 2022, 52 (14) : 16869 - 16885
