Implementation of Real-Time Speech Separation Model Using Time-Domain Audio Separation Network (TasNet) and Dual-Path Recurrent Neural Network (DPRNN)

Cited by: 3

Authors:

Wijayakusuma, Alfian ^{[1]}

Gozali, Davin Reinaldo ^{[1]}

Widjaja, Anthony ^{[1]}

Ham, Hanry ^{[1]}

Affiliations:

[1] Bina Nusantara Univ, Sch Comp Sci, Comp Sci Dept, Jakarta 11480, Indonesia

Source:

5TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND COMPUTATIONAL INTELLIGENCE 2020 | 2021 / Vol. 179

Keywords:

Speech Separation; Time-Domain; Time-Domain Audio Separation Network; Dual-Path Recurrent Neural Network; Real-Time;

DOI:

10.1016/j.procs.2021.01.065

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The purpose of this research is to develop a model that can perform real-time, speaker-independent, multi-talker speech separation in the time domain using the Time-Domain Audio Separation Network (TasNet) and the Dual-Path Recurrent Neural Network (DPRNN). The research conducts experiments with several RNN architectures, batch sizes, and optimizers as hyperparameters for implementing TasNet and DPRNN, and analyzes the impact of these hyperparameter settings on model performance. The expected result is a model that completes speaker-independent multi-talker speech separation in real time with higher accuracy and lower latency than the models of previous research. (C) 2021 The Authors. Published by Elsevier B.V.


Pages: 762 - 772

Page count: 11

50 items in total

[21] Time-domain adaptive attention network for single-channel speech separation
Kunpeng Wang
Hao Zhou
Jingxiang Cai
Wenna Li
Juan Yao
EURASIP Journal on Audio, Speech, and Music Processing, 2023
[22] Time-domain adaptive attention network for single-channel speech separation
Wang, Kunpeng
Zhou, Hao
Cai, Jingxiang
Li, Wenna
Yao, Juan
EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2023, 2023 (01)
[23] Audio Source Separation from a Monaural Mixture Using Convolutional Neural Network in the Time Domain
Zhang, Peng
Ma, Xiaohong
Ding, Shuxue
ADVANCES IN NEURAL NETWORKS, PT II, 2017, 10262 : 388 - 395
[24] Overlapped spectral demodulation of fiber Bragg grating using convolutional time-domain audio separation network
Shan, Linlin
Yu, Mingxin
Xia, Jiabin
Xin, Jingtao
Deng, Chaofan
Zhu, Lianqing
OPTICAL ENGINEERING, 2023, 62 (06)
[25] Real-Time Speech Enhancement Based on Convolutional Recurrent Neural Network
Girirajan, S.
Pandian, A.
INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2023, 35 (02) : 1987 - 2001
[26] Real-Time Thai Speech Emotion Recognition With Speech Enhancement Using Time-Domain Contrastive Predictive Coding and Conv-Tasnet
Yuenyong, Sumeth
Hnoohom, Narit
Wongpatikaseree, Konlakorn
Singkul, Sattaya
2022 7TH INTERNATIONAL CONFERENCE ON BUSINESS AND INDUSTRIAL RESEARCH (ICBIR2022), 2022 : 78 - 83
[27] Embedding Recurrent Layers with Dual-Path Strategy in a Variant of Convolutional Network for Speaker-Independent Speech Separation
Yang, Xue
Bao, Changchun
INTERSPEECH 2022, 2022 : 5338 - 5342
[28] A new time-domain detection approach with blind separation based on neural network
Li, Guan Nan
Zheng, Feng
Yang, Li
Zhang, Ning
PROCEEDINGS OF THE SECOND INTERNATIONAL SYMPOSIUM ON ELECTRONIC COMMERCE AND SECURITY, VOL I, 2009, : 565 - +
[29] TCNN: TEMPORAL CONVOLUTIONAL NEURAL NETWORK FOR REAL-TIME SPEECH ENHANCEMENT IN THE TIME DOMAIN
Pandey, Ashutosh
Wang, DeLiang
2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019 : 6875 - 6879
[30] An Implementation of Real-Time Audio Monitoring in Network Camera
Yuan, Xuehao
Zhang, Yumeng
Li, Hui
MULTIMEDIA AND SIGNAL PROCESSING, 2012, 346 : 412 - 419
