Cross-lingual transfer learning during supervised training in low resource scenarios

Authors
Das, Amit [1]
Hasegawa-Johnson, Mark [1]
Affiliation
[1] Univ Illinois, Dept Elect & Comp Engn, 1406 W Green St, Urbana, IL 61801 USA
Funding
National Science Foundation (USA);
Keywords
cross-lingual speech recognition; transfer learning; deep neural networks; hidden Markov models;
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
In this study, transfer learning techniques are presented for cross-lingual speech recognition to mitigate the effects of limited data in a target language by using data from richly resourced source languages. A maximum likelihood (ML) based regularization criterion is used to learn the parameters of context-dependent Gaussian mixture model (GMM) based hidden Markov models (HMMs) for phones in the target language, using data from both the target and source languages. Recognition results indicate improved HMM state alignments. The hidden layers of a deep neural network (DNN) are then initialized using unsupervised pre-training of a multilingual deep belief network (DBN). First, the DNN is fine-tuned using a modified cross-entropy criterion that jointly uses HMM state alignments from both the target and source languages. Second, another DNN fine-tuning technique is explored in which training is performed sequentially: first on the source language, then on the target language. Experiments conducted with varying amounts of target data indicate that joint and sequential training of the DNN can improve performance compared to existing techniques. Turkish and English were chosen as the target and source languages, respectively.
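The joint fine-tuning idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the modified cross-entropy criterion, so the version below is only an assumption: a target-language cross-entropy term plus a down-weighted source-language term. The function names and the `src_weight` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log-softmax over the last axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def cross_entropy(logits, labels):
    # Mean negative log-probability of the aligned HMM state label per frame.
    logp = log_softmax(logits)
    return -logp[np.arange(len(labels)), labels].mean()

def joint_cross_entropy(tgt_logits, tgt_labels, src_logits, src_labels, src_weight=0.5):
    # Assumed joint criterion (not the paper's exact formula):
    # target-language loss plus a weighted source-language loss.
    tgt_loss = cross_entropy(tgt_logits, tgt_labels)
    src_loss = cross_entropy(src_logits, src_labels)
    return tgt_loss + src_weight * src_loss
```

Under this assumption, setting `src_weight` to 0 recovers ordinary target-only fine-tuning. Sequential training would instead minimize `cross_entropy` on source frames first and then on target frames, starting from the source-trained weights.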
Pages: 3531-3535
Number of pages: 5