Universal attribute characterization of spoken languages for automatic spoken language recognition

Cited by: 41
Authors
Siniscalchi, Sabato Marco [1]
Reed, Jeremy [2]
Svendsen, Torbjorn [3]
Lee, Chin-Hui [4]
Affiliations
[1] Kore Univ Enna, Fac Engn & Architecture, Enna, Sicily, Italy
[2] Georgia Inst Technol, Georgia Tech Res Inst, Atlanta, GA 30332 USA
[3] Norwegian Univ Sci & Technol, Dept Elect & Telecommun, N-7491 Trondheim, Norway
[4] Georgia Inst Technol, Sch Elect & Comp Engn, Atlanta, GA 30332 USA
Source
COMPUTER SPEECH AND LANGUAGE | 2013 / Vol. 27 / Issue 01
Keywords
Spoken language recognition; Vector space model; Latent semantic analysis; Artificial neural network; Support vector machine; Phonetic features; NEURAL-NETWORKS; DESIGN
DOI
10.1016/j.csl.2012.05.001
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We propose a novel universal acoustic characterization approach to spoken language recognition (LRE). The key idea is to describe any spoken language with a common set of fundamental units that can be defined "universally" across all spoken languages. In this study, speech attributes, such as manner and place of articulation, are chosen to form this unit inventory and used to build a set of language-universal attribute models with data-driven modeling techniques. The vector space modeling approach to LRE is adopted, where a spoken utterance is first decoded into a sequence of attributes independently of its language. Then, a feature vector is generated by using co-occurrence statistics of manner or place units, and the final LRE decision is implemented with a vector space language classifier. Several architectural configurations will be studied, and it will be shown that best performance is attained using a maximal figure-of-merit language classifier. Experimental evidence not only demonstrates the feasibility of the proposed techniques, but it also shows that the proposed technique attains comparable performance to standard approaches on the LRE tasks investigated in this work when the same experimental conditions are adopted. (C) 2012 Elsevier Ltd. All rights reserved.
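The vector space step described in the abstract (a decoded attribute sequence turned into a feature vector of co-occurrence statistics) can be illustrated with a minimal sketch. This is not the authors' implementation: the manner inventory `MANNER_UNITS`, the function name `cooccurrence_vector`, and the choice of normalized bigram counts are assumptions made only for illustration, and the attribute decoder and the language classifier are left out.

```python
from collections import Counter
from itertools import product

# Hypothetical manner-of-articulation inventory (assumed, not taken from the paper).
MANNER_UNITS = ["vowel", "stop", "fricative", "nasal", "glide", "silence"]


def cooccurrence_vector(attributes, units=MANNER_UNITS):
    """Map a decoded attribute sequence to normalized bigram co-occurrence counts."""
    # Fixed ordering of all unit pairs, so every utterance yields the same dimensions.
    bigrams = list(product(units, repeat=2))
    # Count adjacent attribute pairs in the decoded sequence.
    counts = Counter(zip(attributes, attributes[1:]))
    total = sum(counts[b] for b in bigrams)
    if total == 0:
        return [0.0] * len(bigrams)
    return [counts[b] / total for b in bigrams]


# A made-up decoder output for one utterance.
decoded = ["silence", "stop", "vowel", "nasal", "vowel", "fricative", "silence"]
vector = cooccurrence_vector(decoded)
```

A vector of this kind, one per utterance, would then be passed to whichever vector space language classifier is being used.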
Pages: 209-227
Page count: 19