Robust speech recognition with on-line unsupervised acoustic feature compensation

Cited by: 0

Authors:

Buera, Luis [1]

Miguel, Antonio [1]

Lleida, Eduardo [1]

Saz, Oscar [1]

Ortega, Alfonso [1]

Affiliation:

[1] Univ Zaragoza, GTC, E-50009 Zaragoza, Spain

Source:

2007 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, VOLS 1 AND 2 | 2007

Keywords:

robust speech recognition; feature vector normalization; acoustic model adaptation

DOI:

10.1109/ASRU.2007.4430092

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

An on-line unsupervised hybrid compensation technique is proposed to reduce the mismatch between training and testing conditions. It combines MEMLIN CPM (Multi-Environment Model based LInear Normalization with a cross-probability model based on GMMs) with a novel acoustic model adaptation method based on rotation transformations. A set of rotation transformations is estimated by linear regression from clean and MEMLIN CPM-normalized training data in an unsupervised process. In testing, each MEMLIN CPM-normalized frame is decoded using a modified Viterbi algorithm and expanded acoustic models, which are obtained from the reference models and the set of rotation transformations. The proposed solution was evaluated on the Spanish SpeechDat Car database: MEMLIN CPM over standard ETSI front-end parameters reaches an average improvement in WER of 83.89%, while the hybrid solution reaches 92.07%. The hybrid technique was also tested on the Aurora 2 database, obtaining an average improvement of 68.88% with clean training.

Pages: 105-110

Number of pages: 6
