Parameter Tuning-Free Missing-Feature Reconstruction for Robust Sound Recognition

Cited by: 4

Authors:

Liu, Qi [1]

Wu, Jibin [1]

Affiliations:

[1] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore 119077, Singapore

Source:

IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING | 2021 / Vol. 15 / Issue 01

Funding:

National Research Foundation, Singapore

Keywords:

Spectrogram; Matrix decomposition; Acoustics; Task analysis; Computational modeling; Speech recognition; Tuning; Missing-feature reconstruction; matrix factorization; deep neural networks (DNNs); automatic speech recognition (ASR); environmental sound classification; AUTOMATIC SPEECH RECOGNITION; MATRIX COMPLETION; FEATURE-EXTRACTION; ALGORITHM; RECOVERY; OPTIMIZATION;

DOI:

10.1109/JSTSP.2020.3038054

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

With the advent of the deep neural network, automatic speech recognition (ASR) has seen significant improvements in recent years. However, ASR performance degrades rapidly when the acoustic environment, such as the communication channel or background noise, differs from that of the training data. In the missing-feature approach to speech processing, unreliable feature components are identified and reconstructed to overcome signal degradation and acoustic-environment mismatch. To reduce model dependency, we investigate matrix completion techniques for missing-feature reconstruction. However, most matrix completion techniques require parameters to be tuned a priori, e.g., the target rank, which is hard to determine in practice. In this work, we propose a matrix completion method based on matrix factorization for the missing-feature reconstruction task that requires neither model training nor parameter tuning. Experiments show superior feature reconstruction performance and computational efficiency in both speech recognition and environmental sound classification tasks.
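
To make the idea of matrix completion by factorization concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes a feature matrix `X` (e.g., a spectrogram) and a boolean reliability `mask`, and fills the unreliable entries with a low-rank estimate obtained by generic regularized alternating least squares. The function name `complete_matrix` and the parameters `rank`, `reg`, and `n_iter` are hypothetical, hand-picked choices; needing to pick `rank` by hand is exactly the tuning burden the abstract says the proposed method avoids.

```python
import numpy as np


def complete_matrix(X, mask, rank=2, reg=1e-3, n_iter=50, seed=0):
    """Fill entries of X where mask is False with a low-rank estimate.

    Generic masked alternating least squares; rank, reg and n_iter are
    hypothetical, hand-picked settings (not taken from the paper).
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = X.shape
    U = rng.standard_normal((n_rows, rank))  # row factors
    V = rng.standard_normal((n_cols, rank))  # column factors
    ridge = reg * np.eye(rank)

    for _ in range(n_iter):
        # Fit each row factor using only the reliable entries of that row.
        for i in range(n_rows):
            obs = mask[i]
            Vo = V[obs]
            U[i] = np.linalg.solve(Vo.T @ Vo + ridge, Vo.T @ X[i, obs])
        # Fit each column factor using only the reliable entries of that column.
        for j in range(n_cols):
            obs = mask[:, j]
            Uo = U[obs]
            V[j] = np.linalg.solve(Uo.T @ Uo + ridge, Uo.T @ X[obs, j])

    estimate = U @ V.T
    # Keep reliable entries untouched; replace only the unreliable ones.
    return np.where(mask, X, estimate)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    clean = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 30))
    mask = rng.random(clean.shape) < 0.7      # True marks a reliable entry
    corrupted = np.where(mask, clean, 0.0)    # unreliable entries zeroed out
    restored = complete_matrix(corrupted, mask, rank=2)
    print("error on missing entries:", np.linalg.norm((restored - clean)[~mask]))
```

In this sketch the ridge term keeps each small least-squares problem solvable even when a row or column has very few reliable entries.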


Pages: 78-89

Page count: 12

50 items in total

[1] Mask classification for missing-feature reconstruction for robust speech recognition in unknown background noise
Kim, Wooil
Stern, Richard M.
SPEECH COMMUNICATION, 2011, 53 (01) : 1 - 11
[2] MMSE-Based Missing-Feature Reconstruction With Temporal Modeling for Robust Speech Recognition
Gonzalez, Jose A.
Peinado, Antonio M.
Ma, Ning
Gomez, Angel M.
Barker, Jon
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, 21 (03) : 624 - 635
[3] Missing-feature approaches in speech recognition
Raj, B
Stern, RM
IEEE SIGNAL PROCESSING MAGAZINE, 2005, 22 (05) : 101 - 116
[4] Missing-Feature Reconstruction by Leveraging Temporal Spectral Correlation for Robust Speech Recognition in Background Noise Conditions
Kim, Wooil
Hansen, John H. L.
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (08) : 2111 - 2120
[5] Adaptive Speech Model for Missing-Feature Reconstruction
Viana, Hesdras O.
Araujo, Aluizio F. R.
2016 IEEE INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING (DSP), 2016 : 104 - 108
[6] Missing-Feature Reconstruction for Band-Limited Speech Recognition in Spoken Document Retrieval
Kim, Wooil
Hansen, John H. L.
INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006 : 2306 - 2309
[7] Time-Frequency Correlation-Based Missing-Feature Reconstruction for Robust Speech Recognition in Band-Restricted Conditions
Kim, Wooil
Hansen, John H. L.
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2009, 17 (07) : 1292 - 1304
[8] Robust convex biclustering with a tuning-free method
Chen, Yifan
Lei, Chunyin
Li, Chuanquan
Ma, Haiqiang
Hu, Ningyuan
JOURNAL OF APPLIED STATISTICS, 2025, 52 (02) : 271 - 286
[9] Missing-Feature Reconstruction With a Bounded Nonlinear State-Space Model
Remes, Ulpu
Palomaki, Kalle J.
Raiko, Tapani
Honkela, Antti
Kurimo, Mikko
IEEE SIGNAL PROCESSING LETTERS, 2011, 18 (10) : 563 - 566
[10] Missing-Feature Method for Speaker Recognition in Band-Restricted Conditions
Kim, Wooil
Hansen, John H. L.
INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008 : 1909 - 1912
