Towards Unsupervised Training of Speaker Independent Acoustic Models

Authors
Jansen, Aren [1]
Church, Kenneth [1]
Affiliations
[1] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD 21218 USA
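Keywords
speaker independent acoustic models; unsupervised training; spectral clustering; SPEECH
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract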
Can we automatically discover speaker independent phoneme-like subword units with zero resources in a surprise language? There have been a number of recent efforts to automatically discover repeated spoken terms without a recognizer. This paper investigates the feasibility of using these results as constraints for unsupervised acoustic model training. We start with a relatively small set of word types, as well as their locations in the speech. The training process assumes that repetitions of the same (unknown) word share the same (unknown) sequence of subword units. For each word type, we train a whole-word hidden Markov model with Gaussian mixture observation densities and collapse correlated states across the word types using spectral clustering. We find that the resulting state clusters align reasonably well along phonetic lines. In evaluating cross-speaker word similarity, the proposed techniques outperform both raw acoustic features and language-mismatched acoustic models.
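The state-collapsing step mentioned in the abstract can be illustrated with a rough sketch. This is not the authors' implementation. It assumes each HMM state is summarized by a single mean vector (the paper uses Gaussian mixture densities), that state similarity is a Gaussian kernel on those means, and that a plain normalized spectral clustering with a small k-means is adequate. The names state_affinity and spectral_cluster, the sigma parameter, and the toy data are all hypothetical.

```python
import numpy as np


def state_affinity(state_means, sigma=1.0):
    """Gaussian-kernel affinity between states (an assumed similarity measure)."""
    x = np.asarray(state_means, dtype=float)
    sq_dists = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def spectral_cluster(affinity, n_clusters, n_iter=50):
    """Cluster items given a symmetric, non-negative affinity matrix."""
    a = np.asarray(affinity, dtype=float)
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    # Symmetrically normalized affinity D^{-1/2} A D^{-1/2}.
    m = a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # eigh sorts eigenvalues in ascending order; keep the top eigenvectors.
    _, vecs = np.linalg.eigh(m)
    emb = vecs[:, -n_clusters:]
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)

    # Deterministic farthest-point initialization for k-means.
    centers = [emb[0]]
    for _ in range(1, n_clusters):
        dists = np.min([((emb - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(emb[np.argmax(dists)])
    centers = np.array(centers)

    # Standard k-means iterations on the spectral embedding.
    for _ in range(n_iter):
        sq = ((emb[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(sq, axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = emb[labels == k].mean(axis=0)
    return labels


# Hypothetical toy data: two states from word model A, three from word model B.
means = [[0.0, 0.0], [5.0, 5.0], [0.1, -0.1], [5.1, 4.9], [10.0, 0.0]]
labels = spectral_cluster(state_affinity(means), n_clusters=3)
print(labels)
```

In this toy case, states from different word models that have nearly identical means receive the same cluster label, which is the sense in which whole-word states are collapsed into shared subword-like units.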
Pages: 1704-1707
Number of pages: 4