Microphone Array Processing for Distant Speech Recognition: Spherical Arrays

Cited by: 0

Authors:

McDonough, John [1]

Kumatani, Kenichi [2]

Raj, Bhiksha [3]

Affiliations:

[1] Carnegie Mellon Univ, Voci Technol Inc, Pittsburgh, PA 15213 USA

[2] Disney Res, Pittsburgh, PA USA

[3] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA

Source:

2012 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC) | 2012

Keywords:

DESIGN;

DOI:

Not available

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Distant speech recognition (DSR) holds out the promise of the most natural human-computer interface because it enables man-machine interaction through speech without the need to don intrusive body- or head-mounted microphones. With the advent of the Microsoft Kinect, the application of non-uniform linear arrays to the DSR problem has become commonplace, and performance analysis of such arrays is well represented in the literature. Recently, spherical arrays have become the subject of intense research interest in the acoustic array processing community. Such arrays have heretofore been analyzed solely with theoretical metrics under idealized conditions. In this work, we analyze such arrays under realistic conditions. Moreover, we compare a linear array with 64 channels and a total length of 126 cm to a spherical array with 32 channels and a radius of 4.2 cm; these provided word error rates of 9.3% and 10.2%, respectively, on a DSR task. For a speaker positioned at an oblique angle with respect to the linear array, we recorded error rates of 12.8% and 9.7%, respectively, for the linear and spherical arrays. The compact size and outstanding performance of the spherical array recommend it for space-limited and mobile applications such as home gaming consoles and humanoid robots.


Pages: 10
