Microphone Array Processing for Distant Speech Recognition: Spherical Arrays

Citations: 0
Authors
McDonough, John [1 ]
Kumatani, Kenichi [2 ]
Raj, Bhiksha [3 ]
Affiliations
[1] Carnegie Mellon Univ, Voci Technol Inc, Pittsburgh, PA 15213 USA
[2] Disney Res, Pittsburgh, PA USA
[3] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Keywords
DESIGN;
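DOI
Not available
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes
0808; 0809;
Abstract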
Distant speech recognition (DSR) holds out the promise of the most natural human-computer interface because it enables man-machine interactions through speech, without the necessity of donning intrusive body- or head-mounted microphones. With the advent of the Microsoft Kinect, the application of non-uniform linear arrays to the DSR problem has become commonplace. Performance analysis of such arrays is well represented in the literature. Recently, spherical arrays have become the subject of intense research interest in the acoustic array processing community. Such arrays have heretofore been analyzed solely with theoretical metrics under idealized conditions. In this work, we analyze such arrays under realistic conditions. Moreover, we compare a linear array with 64 channels and a total length of 126 cm to a spherical array with 32 channels and a radius of 4.2 cm; we found that these provided word error rates of 9.3% and 10.2%, respectively, on a DSR task. For a speaker positioned at an oblique angle with respect to the linear array, we recorded error rates of 12.8% and 9.7%, respectively, for the linear and spherical arrays. The compact size and outstanding performance of the spherical array recommend it for space-limited and mobile applications such as home-gaming consoles and humanoid robots.
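To put the four word error rates quoted in the abstract side by side, the following minimal Python sketch computes the relative difference between the two arrays in each speaker position. The helper name `relative_change` and the condition label `"default"` are illustrative assumptions (the abstract does not name the first speaker position); only the four percentages come from the abstract.

```python
def relative_change(reference: float, candidate: float) -> float:
    """Return the change of `candidate` relative to `reference`, in percent."""
    return 100.0 * (candidate - reference) / reference


# Word error rates (%) as quoted in the abstract.
# "default" is an assumed label for the first, unnamed speaker position.
wer = {
    "default": {"linear": 9.3, "spherical": 10.2},
    "oblique": {"linear": 12.8, "spherical": 9.7},
}

for condition, rates in wer.items():
    # Positive values mean the spherical array made more errors than the linear one.
    change = relative_change(rates["linear"], rates["spherical"])
    print(f"{condition}: spherical vs. linear = {change:+.1f}% relative WER")
```

Under these figures the 32-channel spherical array has a word error rate roughly 9.7% higher (relative) than the 64-channel linear array in the first position, and roughly 24.2% lower for the oblique speaker position.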
Pages: 10
Related papers
50 in total
  • [1] Microphone Array Processing for Distant Speech Recognition
    Kumatani, Kenichi
    McDonough, John
    Raj, Bhiksha
    IEEE SIGNAL PROCESSING MAGAZINE, 2012, 29 (06) : 127 - 140
  • [2] HMM adaptation and microphone array processing for distant speech recognition
    Kleban, J
    Gong, YF
    2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI, 2000, : 1411 - 1414
  • [3] A DIGITAL MICROPHONE ARRAY FOR DISTANT SPEECH RECOGNITION
    Zwyssig, Erich
    Lincoln, Mike
    Renals, Steve
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 5106 - 5109
  • [4] Microphone Array Processing Strategies for Distant-Based Automatic Speech Recognition
    Khoubrouy, Soudeh A.
    Hansen, John H. L.
    IEEE SIGNAL PROCESSING LETTERS, 2016, 23 (10) : 1344 - 1348
  • [5] Distant Speech Recognition Using a Microphone Array Network
    Nakano, Alberto Yoshihiro
    Nakagawa, Seiichi
    Yamamoto, Kazumasa
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2010, E93D (09): : 2451 - 2462
  • [6] Microphone Array Processing for Distant Speech Recognition: Towards Real-World Deployment
    Kumatani, Kenichi
    Arakawa, Takayuki
    Yamamoto, Kazumasa
    McDonough, John
    Raj, Bhiksha
    Singh, Rita
    Tashev, Ivan
    2012 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2012,
  • [7] DISTANT SPEECH RECOGNITION IN REVERBERANT NOISY CONDITIONS EMPLOYING A MICROPHONE ARRAY
    Morales-Cordovilla, Juan A.
    Hagmueller, Martin
    Pessentheiner, Hannes
    Kubin, Gernot
    2014 PROCEEDINGS OF THE 22ND EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2014, : 2380 - 2384
  • [8] Microphone Array Speech Processing
    Nordholm, Sven
    Abhayapala, Thushara
    Doclo, Simon
    Gannot, Sharon
    Naylor, Patrick
    Tashev, Ivan
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2010,
  • [9] Robust Distant Speech Recognition by Combining Multiple Microphone-Array Processing with Position-Dependent CMN
    Longbiao Wang
    Norihide Kitaoka
    Seiichi Nakagawa
    EURASIP Journal on Advances in Signal Processing, 2006
  • [10] Microphone Array Speech Processing
    Sven Nordholm
    ThusharaD Abhayapala
    Simon Doclo
    Sharon Gannot
    P Naylor
    Ivan Tashev
    EURASIP Journal on Advances in Signal Processing, 2010