On the Utility of Syllable-Based Acoustic Models for Pronunciation Variation Modelling

Cited by: 0

Authors:

Annika Hämäläinen

Lou Boves

Johan de Veth

Louis ten Bosch

Affiliation:

[1] Radboud University Nijmegen, Centre for Language and Speech Technology (CLST), Faculty of Arts

Source:

EURASIP Journal on Audio, Speech, and Music Processing, Volume 2007

Keywords:

Acoustics; Speech Recognition; Substantial Effect; Recognition Performance; Considerable Improvement

DOI:

Not available

Abstract:

Recent research on the TIMIT corpus suggests that longer-length acoustic models are more appropriate for pronunciation variation modelling than the context-dependent phones that conventional automatic speech recognisers use. However, the impressive speech recognition results obtained with longer-length models on TIMIT remain to be reproduced on other corpora. To understand the conditions in which longer-length acoustic models result in considerable improvements in recognition performance, we carry out recognition experiments on both TIMIT and the Spoken Dutch Corpus and analyse the differences between the two sets of results. We establish that the details of the procedure used for initialising the longer-length models have a substantial effect on the speech recognition results. When initialised appropriately, longer-length acoustic models that borrow their topology from a sequence of triphones cannot capture the pronunciation variation phenomena that hinder recognition performance the most.
