SCORE NORMALIZATION AND SYSTEM COMBINATION FOR IMPROVED KEYWORD SPOTTING

Cited by: 0
Authors
Karakos, Damianos [1]
Schwartz, Richard [1]
Tsakalidis, Stavros [1]
Zhang, Le [1]
Ranjan, Shivesh [1]
Ng, Tim [1]
Hsiao, Roger [1]
Saikumar, Guruprasad [1]
Bulyko, Ivan [1]
Nguyen, Long [1]
Makhoul, John [1]
Grezl, Frantisek [2]
Hannemann, Mirko [2]
Karafiat, Martin [2]
Szoke, Igor [2]
Vesely, Karel [2]
Lamel, Lori [3]
Le, Viet-Bac [4]
Affiliations
[1] Raytheon BBN Technol, Cambridge, MA 02138 USA
[2] Brno Univ Technol, SpeechFIT, Brno, Czech Republic
[3] CNRS LIMSI, Paris, France
[4] Vocapia Res, Paris, France
Keywords
keyword search; score normalization; system combination; indexing and search
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present two techniques that are shown to yield improved Keyword Spotting (KWS) performance when using the ATWV/MTWV performance measures: (i) score normalization, where the scores of different keywords become commensurate with each other and they more closely correspond to the probability of being correct than raw posteriors; and (ii) system combination, where the detections of multiple systems are merged together, and their scores are interpolated with weights which are optimized using MTWV as the maximization criterion. Both score normalization and system combination approaches show that significant gains in ATWV/MTWV can be obtained, sometimes on the order of 8-10 points (absolute), in five different languages. A variant of these methods resulted in the highest performance for the official surprise language evaluation for the IARPA-funded Babel project in April 2013.
Pages: 210-215
Page count: 6
Related papers
50 in total
  • [41] THE 2013 BBN VIETNAMESE TELEPHONE SPEECH KEYWORD SPOTTING SYSTEM
    Tsakalidis, Stavros
    Hsiao, Roger
    Karakos, Damianos
    Ng, Tim
    Ranjan, Shivesh
    Saikumar, Guruprasad
    Zhang, Le
    Nguyen, Long
    Schwartz, Richard
    Makhoul, John
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [42] Improving the performance of a keyword spotting system by using support vector
    Benayed, Y
    Fohr, D
    Haton, JP
    Chollet, G
    ASRU'03: 2003 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING ASRU '03, 2003, : 145 - 149
  • [43] Improved External Speaker-Robust Keyword Spotting for Hearing Assistive Devices
    Lopez-Espejo, Ivan
    Tan, Zheng-Hua
    Jensen, Jesper
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 1233 - 1247
  • [44] Keyword Spotting with Quaternionic ResNet: Application to Spotting in Greek Manuscripts
    Sfikas, Giorgos
    Retsinas, George
    Giotis, Angelos P.
    Gatos, Basilis
    Nikou, Christophoros
    DOCUMENT ANALYSIS SYSTEMS, DAS 2022, 2022, 13237 : 382 - 396
  • [45] COMBINATION OF SEARCH TECHNIQUES FOR IMPROVED SPOTTING OF OOV KEYWORDS
    Karakos, Damianos
    Schwartz, Richard M.
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 5336 - 5340
  • [46] A multimodel keyword spotting system based on lip movement and speech features
    Handa, Anand
    Agarwal, Rashi
    Kohli, Narendra
    MULTIMEDIA TOOLS AND APPLICATIONS, 2020, 79 (27-28) : 20461 - 20481
  • [47] A depthwise separable convolutional neural network for keyword spotting on an embedded system
    Peter Mølgaard Sørensen
    Bastian Epp
    Tobias May
    EURASIP Journal on Audio, Speech, and Music Processing, 2020
  • [48] A depthwise separable convolutional neural network for keyword spotting on an embedded system
    Sorensen, Peter Molgaard
    Epp, Bastian
    May, Tobias
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2020, 2020 (01)
  • [49] A multimodel keyword spotting system based on lip movement and speech features
    Anand Handa
    Rashi Agarwal
    Narendra Kohli
    Multimedia Tools and Applications, 2020, 79 : 20461 - 20481
  • [50] A novel spoken keyword spotting system using support vector machine
    Sangeetha, J.
    Jothilakshmi, S.
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2014, 36 : 287 - 293