Microphone Array Processing for Distant Speech Recognition: Spherical Arrays

Cited by: 0
Authors
McDonough, John [1]
Kumatani, Kenichi [2]
Raj, Bhiksha [3]
Affiliations
[1] Carnegie Mellon Univ, Voci Technol Inc, Pittsburgh, PA 15213 USA
[2] Disney Res, Pittsburgh, PA USA
[3] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Keywords
DESIGN;
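DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809;
Abstract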
Distant speech recognition (DSR) holds out the promise of the most natural human-computer interface because it enables man-machine interactions through speech, without the necessity of donning intrusive body- or head-mounted microphones. With the advent of the Microsoft Kinect, the application of non-uniform linear arrays to the DSR problem has become commonplace. Performance analysis of such arrays is well represented in the literature. Recently, spherical arrays have become the subject of intense research interest in the acoustic array processing community. Such arrays have heretofore been analyzed solely with theoretical metrics under idealized conditions. In this work, we analyze such arrays under realistic conditions. Moreover, we compare a linear array with 64 channels and a total length of 126 cm to a spherical array with 32 channels and a radius of 4.2 cm; we found that these provided word error rates of 9.3% and 10.2%, respectively, on a DSR task. For a speaker positioned at an oblique angle with respect to the linear array, we recorded error rates of 12.8% and 9.7%, respectively, for the linear and spherical arrays. The compact size and outstanding performance of the spherical array recommend it for space-limited and mobile applications such as home-gaming consoles and humanoid robots.
Pages: 10
Related papers
50 in total
  • [21] Adaptive Microphone Array Processing for High-Performance Speech Recognition in Car Environment
    Hong, Jungpyo
    Han, Seungho
    Jeong, Sangbae
    Hahn, Minsoo
    IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS (ICCE 2011), 2011, : 829 - +
  • [22] Adaptive Microphone Array Processing for High-Performance Speech Recognition in Car Environment
    Hong, Jungpyo
    Han, Seungho
    Jeong, Sangbae
    Hahn, Minsoo
    IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 2011, 57 (01) : 260 - 266
  • [23] SPEECH RECOGNITION IN NOISY ENVIRONMENTS WITH THE AID OF MICROPHONE ARRAYS
    VANCOMPERNOLLE, D
    MA, W
    XIE, F
    VANDIEST, M
    SPEECH COMMUNICATION, 1990, 9 (5-6) : 433 - 442
  • [24] A Posterior Approach for Microphone Array Based Speech Recognition
    Wang, Dong
    Himawan, Ivan
    Frankel, Joe
    King, Simon
    INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 996 - 999
  • [25] Joint Training of Multi-channel-condition Dereverberation and Acoustic Modeling of Microphone Array Speech for Robust Distant Speech Recognition
    Ge, Fengpei
    Li, Kehuang
    Wu, Bo
    Siniscalchi, Sabato Marco
    Yan, Yonghong
    Lee, Chin-Hui
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3847 - 3851
  • [26] Microphone array sub-band speech recognition
    McCowan, IA
    Sridharan, S
    2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING, 2001, : 185 - 188
  • [27] Robust speech recognition with speaker localization by a microphone array
    Yamada, T
    Nakamura, S
    Shikano, K
    ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 1317 - 1320
  • [28] Modern microphone array for hearing aid and speech processing
    Wang, A
    Yao, K
    Hudson, RE
    Korompis, D
    Lorenzelli, F
    Soli, SD
    Gao, S
    ADVANCED SIGNAL PROCESSING ALGORITHMS, ARCHITECTURES, AND IMPLEMENTATIONS VI, 1996, 2846 : 112 - 121
  • [29] Distant-talking speech recognition with microphone-array sound pickup and NN/MLLR environment equalization
    Lin, QG
    Flanagan, J
    Che, CW
    PROGRESS IN CONNECTIONIST-BASED INFORMATION SYSTEMS, VOLS 1 AND 2, 1998, : 1099 - 1102
  • [30] Microphone Array Post-filter based on Spatially-Correlated Noise Measurements for Distant Speech Recognition
    Kumatani, Kenichi
    Raj, Bhiksha
    Singh, Rita
    McDonough, John
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 298 - 301