Zero-resource audio-only spoken term detection based on a combination of template matching techniques

Cited by: 0

Authors:

Muscariello, Armando [1]

Gravier, Guillaume [1]

Bimbot, Frederic [1]

Affiliations:

[1] IRISA CNRS UMR 6074, Paris, France

Source:

12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5 | 2011

Keywords:

spoken term detection; template matching; unsupervised learning; posterior features;

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Spoken term detection is a well-known information retrieval task that seeks to extract meaningful content from audio by locating occurrences of known query words of interest. This paper describes a zero-resource approach to this task based on pattern matching of spoken term queries at the acoustic level. The template matching module is a cascade of a segmental variant of dynamic time warping and a self-similarity matrix comparison, the latter added to further improve robustness to speech variability. This solution differs notably from more traditional train-and-test methods that, while shown to be very accurate, rely on the availability of large amounts of linguistic resources. We evaluate our framework on different parameterizations of the speech templates: raw MFCC features, Gaussian posteriorgrams, and French and English phonetic posteriorgrams output by two different state-of-the-art phoneme recognizers.
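
The cascade described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes plain dynamic time warping over fixed-length sliding windows (not the segmental variant used in the paper), Euclidean frame distances, a nearest-neighbour resampling of the self-similarity matrices, and hypothetical thresholds `dtw_threshold` and `ssm_threshold`. All function names are illustrative.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature frames (lists of floats)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_cost(query, segment):
    """Length-normalised DTW alignment cost between two frame sequences."""
    n, m = len(query), len(segment)
    inf = float("inf")
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(query[i - 1], segment[j - 1])
            acc[i][j] = d + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m] / (n + m)


def self_similarity(seq):
    """Matrix of pairwise frame distances within one sequence."""
    return [[euclidean(a, b) for b in seq] for a in seq]


def resample(matrix, size):
    """Nearest-neighbour resampling of a square matrix to size x size."""
    n = len(matrix)
    idx = [min(n - 1, (k * n) // size) for k in range(size)]
    return [[matrix[i][j] for j in idx] for i in idx]


def ssm_distance(seq_a, seq_b, size=8):
    """Mean absolute difference between resampled self-similarity matrices."""
    a = resample(self_similarity(seq_a), size)
    b = resample(self_similarity(seq_b), size)
    total = sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return total / (size * size)


def detect(query, utterance, dtw_threshold, ssm_threshold, hop=1):
    """Return window start indices that pass both stages of the cascade.

    Assumption: candidate windows have the same length as the query.
    """
    hits = []
    w = len(query)
    for start in range(0, len(utterance) - w + 1, hop):
        seg = utterance[start:start + w]
        # Stage 1: DTW filters out acoustically distant windows.
        if dtw_cost(query, seg) > dtw_threshold:
            continue
        # Stage 2: self-similarity comparison confirms the surviving candidates.
        if ssm_distance(query, seg) <= ssm_threshold:
            hits.append(start)
    return hits


query = [[0.0], [1.0], [2.0]]
utterance = [[5.0], [5.0], [0.0], [1.0], [2.0], [5.0]]
hits = detect(query, utterance, dtw_threshold=0.1, ssm_threshold=0.1)
```

In this toy example the frames are one-dimensional; in practice each frame would be an MFCC vector or a posteriorgram, and the thresholds would need tuning on held-out data.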


Pages: 928-931

Page count: 4

25 records in total

[1] Simulating Zero-Resource Spoken Term Discovery
White, Jerome
Oard, Douglas W.
CIKM'17: PROCEEDINGS OF THE 2017 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2017, : 2371 - 2374
[2] Spoken Term Detection of Zero-Resource Language using Machine Learning
Ito, Akinori
Koizumi, Masatoshi
2018 INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATION TECHNOLOGY (ICIIT 2018), 2018, : 45 - 49
[3] Self-Paced Pattern Augmentation for Spoken Term Detection in Zero-Resource
Sudhakar, P.
Rao, Sreenivasa K.
Mitra, Pabitra
INTERSPEECH 2023, 2023, : 1618 - 1622
[4] ZERO-RESOURCE SPOKEN TERM DETECTION USING HIERARCHICAL GRAPH-BASED SIMILARITY SEARCH
Aoyama, Kazuo
Ogawa, Atsunori
Hattori, Takashi
Hori, Takaaki
Nakamura, Atsushi
2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
[5] A Novel Zero-Resource Spoken Term Detection Using Affinity Kernel Propagation with Acoustic Feature Map
Sudhakar P.
Rao K.S.
Mitra P.
SN Computer Science, 4 (3)
[6] ZERO RESOURCE GRAPH-BASED CONFIDENCE ESTIMATION FOR OPEN VOCABULARY SPOKEN TERM DETECTION
Norouzian, Atta
Rose, Richard
Ghalehjegh, Sina Hamidi
Jansen, Aren
2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 8292 - 8296
[7] A Fast Query-by-Example Spoken Term Detection for Zero Resource Languages
Pandia, Karthik D. S.
Saranya, M. S.
Murthy, Hema A.
2016 INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS (SPCOM), 2016,
[8] SUBWORD-BASED SPOKEN TERM DETECTION IN AUDIO COURSE LECTURES
Rose, Richard
Norouzian, Atta
Reddy, Aarthi
Coy, Andre
Gupta, Vishwa
Karafiat, Martin
2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 5282 - 5285
[9] Spoken term detection system based on combination of LVCSR and phonetic search
Szoeke, Igor
Fapso, Michal
Karafiat, Martin
Burget, Lukas
Grezl, Frantisek
Schwarz, Petr
Glembek, Ondrej
Matejka, Pavel
Kopecky, Jiri
Cernocky, Jan Honza
MACHINE LEARNING FOR MULTIMODAL INTERACTION, 2008, 4892 : 237 - 247
[10] USING PARALLEL TOKENIZERS WITH DTW MATRIX COMBINATION FOR LOW-RESOURCE SPOKEN TERM DETECTION
Wang, Haipeng
Lee, Tan
Leung, Cheung-Chi
Ma, Bin
Li, Haizhou
2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 8545 - 8549
