Using Syntax in Large-Scale Audio Document Translation

Authors
Zheng, Jing [1]
Ayan, Necip Fazil [1]
Wang, Wen [1]
Burkett, David [2]
Affiliations
[1] SRI Int, Speech Technol & Res Lab, 333 Ravenswood Ave, Menlo Pk, CA 94025 USA
[2] Univ Calif Berkeley, EECS Dept, Berkeley, CA 94720 USA
Keywords
syntax; machine translation; audio document
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Recently, the use of syntax has very effectively improved machine translation (MT) quality in many text translation tasks. However, using syntax in speech translation poses additional challenges because of disfluencies and other spoken language phenomena, and because of errors introduced by automatic speech recognition (ASR). In this paper, we investigate the effect of using syntax in a large-scale audio document translation task targeting broadcast news and broadcast conversations. We do so by comparing the performance of three translation approaches based on synchronous context-free grammars: 1) hierarchical phrase-based translation, 2) syntax-augmented MT, and 3) string-to-dependency MT. The results show a positive effect of explicitly using syntax when translating broadcast news, but no benefit when translating broadcast conversations. The results indicate that improving the robustness of syntactic systems against conversational language style is important to their success and requires future effort.
Pages: 444 / +
Page count: 2