Zero-resource audio-only spoken term detection based on a combination of template matching techniques

Cited by: 0
Authors
Muscariello, Armando [1]
Gravier, Guillaume [1]
Bimbot, Frederic [1]
Affiliations
[1] IRISA CNRS UMR 6074, Paris, France
Source
12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5 | 2011
Keywords
spoken term detection; template matching; unsupervised learning; posterior features
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Spoken term detection is a well-known information retrieval task that seeks to extract contentful information from audio by locating occurrences of known query words of interest. This paper describes a zero-resource approach to this task based on pattern matching of spoken term queries at the acoustic level. The template matching module comprises a cascade of a segmental variant of dynamic time warping and a self-similarity matrix comparison to further improve robustness to speech variability. This solution differs notably from more traditional train-and-test methods that, while shown to be very accurate, rely on the availability of large amounts of linguistic resources. We evaluate our framework on different parameterizations of the speech templates: raw MFCC features, Gaussian posteriorgrams, and French and English phonetic posteriorgrams output by two different state-of-the-art phoneme recognizers.
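The two-stage matching described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps in plain subsequence DTW for their segmental variant, and the cosine frame distance, the fixed-size resampling of the self-similarity matrices, and all function names (`subsequence_dtw`, `self_similarity`, `ssm_distance`) are assumptions made for illustration only. The synthetic features stand in for MFCCs or posteriorgrams.

```python
import numpy as np

def cosine_distance_matrix(a, b):
    """Pairwise cosine distance between the rows of a (n, d) and b (m, d)."""
    a_n = a / (np.linalg.norm(a, axis=1, keepdims=True) + 1e-12)
    b_n = b / (np.linalg.norm(b, axis=1, keepdims=True) + 1e-12)
    return 1.0 - a_n @ b_n.T

def subsequence_dtw(query, utterance):
    """Align the whole query to the best-matching span of the utterance.

    Returns (start, end, cost): inclusive frame indices in the utterance
    and the accumulated distance divided by the query length.
    Assumption: plain subsequence DTW, not the paper's segmental variant.
    """
    dist = cosine_distance_matrix(query, utterance)
    n, m = dist.shape
    acc = np.full((n, m), np.inf)
    start = np.zeros((n, m), dtype=int)
    acc[0, :] = dist[0, :]        # the match may begin at any utterance frame
    start[0, :] = np.arange(m)
    for i in range(1, n):
        for j in range(m):
            # Predecessors: vertical, then (if available) diagonal and horizontal.
            best_cost, best_start = acc[i - 1, j], start[i - 1, j]
            if j > 0:
                if acc[i - 1, j - 1] <= best_cost:
                    best_cost, best_start = acc[i - 1, j - 1], start[i - 1, j - 1]
                if acc[i, j - 1] < best_cost:
                    best_cost, best_start = acc[i, j - 1], start[i, j - 1]
            acc[i, j] = dist[i, j] + best_cost
            start[i, j] = best_start
    end = int(np.argmin(acc[-1, :]))  # the match may end at any utterance frame
    return int(start[-1, end]), end, float(acc[-1, end] / n)

def self_similarity(features, size=20):
    """Self-similarity matrix on `size` frames picked uniformly in time."""
    idx = np.linspace(0, len(features) - 1, size).round().astype(int)
    picked = features[idx]
    return cosine_distance_matrix(picked, picked)

def ssm_distance(query, segment, size=20):
    """Mean absolute difference between the two self-similarity matrices."""
    return float(np.mean(np.abs(self_similarity(query, size)
                                - self_similarity(segment, size))))

# Synthetic demo: hide an exact copy of the query inside a random utterance.
rng = np.random.default_rng(0)
query = rng.normal(size=(15, 13))
utterance = np.vstack([rng.normal(size=(30, 13)), query, rng.normal(size=(25, 13))])

hit_start, hit_end, hit_cost = subsequence_dtw(query, utterance)
hit_ssm = ssm_distance(query, utterance[hit_start:hit_end + 1])
miss_ssm = ssm_distance(query, utterance[:15])
print(hit_start, hit_end, hit_cost, hit_ssm, miss_ssm)
```

In a cascade of this kind, the DTW stage would propose candidate spans and the self-similarity comparison would then accept or reject each candidate; how the two scores are thresholded or combined is left open here because the abstract does not specify it.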
Pages: 928-931
Page count: 4
Related papers
25 in total
  • [1] Simulating Zero-Resource Spoken Term Discovery
    White, Jerome
    Oard, Douglas W.
    CIKM'17: PROCEEDINGS OF THE 2017 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2017, : 2371 - 2374
  • [2] Spoken Term Detection of Zero-Resource Language using Machine Learning
    Ito, Akinori
    Koizumi, Masatoshi
    2018 INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATION TECHNOLOGY (ICIIT 2018), 2018, : 45 - 49
  • [3] Self-Paced Pattern Augmentation for Spoken Term Detection in Zero-Resource
    Sudhakar, P.
    Rao, Sreenivasa K.
    Mitra, Pabitra
    INTERSPEECH 2023, 2023, : 1618 - 1622
  • [4] ZERO-RESOURCE SPOKEN TERM DETECTION USING HIERARCHICAL GRAPH-BASED SIMILARITY SEARCH
    Aoyama, Kazuo
    Ogawa, Atsunori
    Hattori, Takashi
    Hori, Takaaki
    Nakamura, Atsushi
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [5] A Novel Zero-Resource Spoken Term Detection Using Affinity Kernel Propagation with Acoustic Feature Map
    Sudhakar P.
    Rao K.S.
    Mitra P.
    SN Computer Science, 4 (3)
  • [6] ZERO RESOURCE GRAPH-BASED CONFIDENCE ESTIMATION FOR OPEN VOCABULARY SPOKEN TERM DETECTION
    Norouzian, Atta
    Rose, Richard
    Ghalehjegh, Sina Hamidi
    Jansen, Aren
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 8292 - 8296
  • [7] A Fast Query-by-Example Spoken Term Detection for Zero Resource Languages
    Pandia, Karthik D. S.
    Saranya, M. S.
    Murthy, Hema A.
    2016 INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS (SPCOM), 2016,
  • [8] SUBWORD-BASED SPOKEN TERM DETECTION IN AUDIO COURSE LECTURES
    Rose, Richard
    Norouzian, Atta
    Reddy, Aarthi
    Coy, Andre
    Gupta, Vishwa
    Karafiat, Martin
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 5282 - 5285
  • [9] Spoken term detection system based on combination of LVCSR and phonetic search
    Szoeke, Igor
    Fapso, Michal
    Karafiat, Martin
    Burget, Lukas
    Grezl, Frantisek
    Schwarz, Petr
    Glembek, Ondrej
    Matejka, Pavel
    Kopecky, Jiri
    Cernocky, Jan Honza
    MACHINE LEARNING FOR MULTIMODAL INTERACTION, 2008, 4892 : 237 - 247
  • [10] USING PARALLEL TOKENIZERS WITH DTW MATRIX COMBINATION FOR LOW-RESOURCE SPOKEN TERM DETECTION
    Wang, Haipeng
    Lee, Tan
    Leung, Cheung-Chi
    Ma, Bin
    Li, Haizhou
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 8545 - 8549