Modeling Dialectal Variation for Swiss German Automatic Speech Recognition

Cited by: 4

Authors:

Khosravani, Abbas [1]

Garner, Philip N. [1]

Lazaridis, Alexandros [2]

Affiliations:

[1] Idiap Res Inst, Martigny, Switzerland

[2] Swisscom AG, Data Analyt & AI Grp, Bern, Switzerland

Source:

INTERSPEECH 2021 | 2021

Keywords:

Speech recognition; Wav2vec; dialectal lexicon; Swiss German; multi-dialect; Swisscom; voice assistant; TV Box;

DOI:

10.21437/Interspeech.2021-1735

Chinese Library Classification:

R36 [Pathology]; R76 [Otorhinolaryngology];

Subject classification codes:

100104 ; 100213 ;

Abstract:

We describe a speech recognition system for Swiss German, a dialectal spoken language in German-speaking Switzerland. Swiss German has no standard orthography, and its written form varies significantly. To alleviate the uncertainty associated with this variability, we automatically generate a lexicon from which multiple written forms of a given word in any dialect can be generated. The lexicon is built from a small (incomplete) handcrafted lexicon designed by linguistic experts and contains forms of common words in various Swiss German dialects. We exploit the powerful speech representation of self-supervised acoustic pre-training (wav2vec) to address the low-resource nature of the spoken dialects. The proposed approach results in an overall relative improvement of 9% in word error rate compared to one based on an expert-generated lexicon for our TV Box voice assistant application.


Pages: 2896-2900

Page count: 5
