Understanding and Bridging the Modality Gap for Speech Translation

Authors
Fang, Qingkai [1, 2]
Feng, Yang [1, 2]
Affiliations
[1] Chinese Acad Sci ICT CAS, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
Funding
National Key Research and Development Program of China
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
How to achieve better end-to-end speech translation (ST) by leveraging (text) machine translation (MT) data? Among various existing techniques, multi-task learning is one of the effective ways to share knowledge between ST and MT in which additional MT data can help to learn source-to-target mapping. However, due to the differences between speech and text, there is always a gap between ST and MT. In this paper, we first aim to understand this modality gap from the target-side representation differences, and link the modality gap to another well-known problem in neural machine translation: exposure bias. We find that the modality gap is relatively small during training except for some difficult cases, but keeps increasing during inference due to the cascading effect. To address these problems, we propose the Cross-modal Regularization with Scheduled Sampling (CRESS) method. Specifically, we regularize the output predictions of ST and MT, whose target-side contexts are derived by sampling between ground truth words and self-generated words with a varying probability. Furthermore, we introduce token-level adaptive training which assigns different training weights to target tokens to handle difficult cases with large modality gaps. Experiments and analysis show that our approach effectively bridges the modality gap, and achieves promising results in all eight directions of the MuST-C dataset.
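The three ingredients named in the abstract (a sampled target-side context, a regularizer between ST and MT output predictions, and per-token training weights) can be illustrated with a small toy sketch. This is not the authors' implementation: the function names, the choice of a symmetric KL divergence as the regularizer, and the way the weights enter the loss are all assumptions made here for illustration, and the record above does not specify any of them.

```python
import math
import random


def mix_context(gold, predicted, p_gold, rng):
    """Scheduled-sampling style context (illustrative).

    At each position keep the ground-truth token with probability p_gold,
    otherwise use the model's own previously generated token.
    """
    return [g if rng.random() < p_gold else y for g, y in zip(gold, predicted)]


def kl(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of floats."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def symmetric_kl(p, q):
    """Assumed regularizer: average of the two KL directions."""
    return 0.5 * (kl(p, q) + kl(q, p))


def weighted_regularizer(st_dists, mt_dists, weights):
    """Weighted mean of per-token divergence between ST and MT predictions.

    weights is a hypothetical per-token weight list; larger values would
    emphasize tokens where the two modalities disagree more.
    """
    per_token = [symmetric_kl(p, q) for p, q in zip(st_dists, mt_dists)]
    return sum(w * d for w, d in zip(weights, per_token)) / sum(weights)


if __name__ == "__main__":
    rng = random.Random(0)
    gold = ["das", "ist", "gut"]
    predicted = ["das", "war", "gut"]
    # p_gold would typically change over training ("varying probability").
    print(mix_context(gold, predicted, p_gold=0.5, rng=rng))

    st = [[0.7, 0.2, 0.1], [0.4, 0.4, 0.2]]
    mt = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
    print(weighted_regularizer(st, mt, weights=[1.0, 2.0]))
```

In this sketch the first token contributes nothing to the regularizer because the two toy distributions are identical, while the second token, where they differ, is given twice the weight.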
Pages: 15864-15881
Page count: 18