Speech Enhancement Using Dynamical Variational AutoEncoder

Cited by: 0

Authors:

Do, Hao D. [1]

Affiliations:

[1] FPT Univ, Ho Chi Minh City, Vietnam

Source:

INTELLIGENT INFORMATION AND DATABASE SYSTEMS, ACIIDS 2023, PT II | 2023 / Vol. 13996

Keywords:

speech enhancement; dynamical variational autoencoder; generative model

DOI:

10.1007/978-981-99-5837-5_21

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

This research addresses speech enhancement with a generative model. Many existing solutions are trained on a fixed set of interference or noise types and struggle to extract speech from a mixture containing an unfamiliar noise. We use a class of generative models called Dynamical Variational AutoEncoders (DVAE), which combine generative and temporal modeling to analyze the speech signal. Models in this class attend to the behavior of the speech signal, then extract and enhance the speech. Moreover, we design a new architecture in the DVAE class, named Bi-RVAE, which is simpler than the other models yet achieves good results. Experimental results show that the DVAE class, including our proposed design, recovers high-quality speech. This class could be used to enhance the speech signal before it is passed to the main processing models.

Pages: 247-258

Page count: 12
