Strategies for improving low resource speech to text translation relying on pre-trained ASR models

Cited: 0
Authors
Kesiraju, Santosh [1 ]
Sarvas, Marek [1 ]
Pavlicek, Tomas [2 ]
Macaire, Cecile [3 ]
Ciuba, Alejandro [4 ]
Affiliations
[1] Brno Univ Technol, Speech FIT, Brno, Czech Republic
[2] Phonexia, Brno, Czech Republic
[3] Univ Grenoble Alpes, Grenoble, France
[4] Univ Pittsburgh, Pittsburgh, PA 15260 USA
Source
INTERSPEECH 2023
Funding
EU Horizon 2020; U.S. National Science Foundation;
Keywords
speech translation; low-resource; multilingual; speech recognition;
DOI
10.21437/Interspeech.2023-2506
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
This paper presents techniques and findings for improving the performance of low-resource speech-to-text translation (ST). We conducted experiments in both simulated and real low-resource setups, on the language pairs English - Portuguese and Tamasheq - French, respectively. Using the encoder-decoder framework for ST, our results show that a multilingual automatic speech recognition system acts as a good initialization under low-resource scenarios. Furthermore, using CTC as an additional objective for translation during training and decoding helps to reorder the internal representations and improves the final translation. Through our experiments, we try to identify the factors (initializations, objectives, and hyper-parameters) that contribute most to improvements in low-resource setups. With only 300 hours of pre-training data, our model achieved a BLEU score of 7.3 on the Tamasheq - French data, outperforming prior published work from IWSLT 2022 by 1.6 points.
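The abstract describes using CTC as an additional training objective alongside the usual attention-based translation loss. A minimal sketch of such a joint objective, assuming a PyTorch-style setup, is shown below; `joint_st_loss` and `ctc_weight` are illustrative names and values, not taken from the paper, and the paper's exact formulation may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def joint_st_loss(ctc_log_probs, input_lens, targets, target_lens,
                  dec_logits, dec_targets, ctc_weight=0.3):
    """Interpolate an auxiliary CTC loss with the usual attention
    (cross-entropy) translation loss.

    ctc_log_probs: (T, N, C) log-softmax outputs over target tokens,
                   fed to CTC (blank index 0)
    dec_logits:    (N, S, C) raw decoder logits for cross-entropy
    """
    # CTC term: monotonic alignment of encoder frames to target tokens
    ctc = nn.CTCLoss(blank=0, zero_infinity=True)(
        ctc_log_probs, targets, input_lens, target_lens)
    # Attention term: standard token-level cross-entropy on the decoder
    ce = F.cross_entropy(dec_logits.reshape(-1, dec_logits.size(-1)),
                         dec_targets.reshape(-1))
    # Weighted interpolation of the two objectives
    return ctc_weight * ctc + (1.0 - ctc_weight) * ce
```

In joint CTC/attention frameworks of this kind, the interpolation weight is a hyper-parameter, which matches the abstract's note that hyper-parameter choices are among the factors examined.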
Pages: 2148 - 2152
Number of pages: 5
Related papers (50 total)
  • [41] Radical-vectors with pre-trained models for Chinese Text Classification
    Yin, Guoqing
    Wu, Junmin
    Zhao, Guochao
    2022 EURO-ASIA CONFERENCE ON FRONTIERS OF COMPUTER SCIENCE AND INFORMATION TECHNOLOGY, FCSIT, 2022, : 12 - 15
  • [42] Towards unifying pre-trained language models for semantic text exchange
    Miao, Jingyuan
    Zhang, Yuqi
    Jiang, Nan
    Wen, Jie
    Pei, Kanglu
    Wan, Yue
    Wan, Tao
    Chen, Honglong
    WIRELESS NETWORKS, 2024, 30 (07) : 6385 - 6398
  • [43] Recipes for Adapting Pre-trained Monolingual and Multilingual Models to Machine Translation
    Stickland, Asa Cooper
    Li, Xian
    Ghazvininejad, Marjan
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 3440 - 3453
  • [44] Controlling Translation Formality Using Pre-trained Multilingual Language Models
    Rippeth, Elijah
    Agrawal, Sweta
    Carpuat, Marine
    PROCEEDINGS OF THE 19TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE TRANSLATION (IWSLT 2022), 2022, : 327 - 340
  • [45] Reinforced Curriculum Learning on Pre-Trained Neural Machine Translation Models
    Zhao, Mingjun
    Wu, Haijiang
    Niu, Di
    Wang, Xiaoli
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 9652 - 9659
  • [46] Efficient Fine-Tuning for Low-Resource Tibetan Pre-trained Language Models
    Zhou, Mingjun
    Daiqing, Zhuoma
    Qun, Nuo
    Nyima, Tashi
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING-ICANN 2024, PT VII, 2024, 15022 : 410 - 422
  • [47] Comparing pre-trained language models for Spanish hate speech detection
    Plaza-del-Arco, Flor Miriam
    Molina-Gonzalez, M. Dolores
    Urena-Lopez, L. Alfonso
    Martin-Valdivia, M. Teresa
    EXPERT SYSTEMS WITH APPLICATIONS, 2021, 166
  • [48] Improving data augmentation for low resource speech-to-text translation with diverse paraphrasing
    Mi, Chenggang
    Xie, Lei
    Zhang, Yanning
    NEURAL NETWORKS, 2022, 148 : 194 - 205
  • [49] Improving Quality Estimation of Machine Translation by Using Pre-trained Language Representation
    Miao, Guoyi
    Di, Hui
    Xu, Jinan
    Yang, Zhongcheng
    Chen, Yufeng
    Ouchi, Kazushige
    MACHINE TRANSLATION, CCMT 2019, 2019, 1104 : 11 - 22
  • [50] Low-Resource Speech-to-Text Translation
    Bansal, Sameer
    Kamper, Herman
    Livescu, Karen
    Lopez, Adam
    Goldwater, Sharon
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 1298 - 1302