Using Syntax in Large-Scale Audio Document Translation

Cited by: 0
Authors
Zheng, Jing [1]
Ayan, Necip Fazil [1]
Wang, Wen [1]
Burkett, David [2]
Affiliations
[1] SRI Int, Speech Technol & Res Lab, 333 Ravenswood Ave, Menlo Pk, CA 94025 USA
[2] Univ Calif Berkeley, EECS Dept, Berkeley, CA 94720 USA
Keywords
syntax; machine translation; audio document
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Recently, the use of syntax has very effectively improved machine translation (MT) quality in many text translation tasks. However, using syntax in speech translation poses additional challenges because of disfluencies and other spoken-language phenomena, and because of errors introduced by automatic speech recognition (ASR). In this paper, we investigate the effect of using syntax in a large-scale audio document translation task targeting broadcast news and broadcast conversations. We do so by comparing the performance of three translation approaches based on synchronous context-free grammars: 1) hierarchical phrase-based translation, 2) syntax-augmented MT, and 3) string-to-dependency MT. The results show a positive effect of explicitly using syntax when translating broadcast news, but no benefit when translating broadcast conversations. The results indicate that improving the robustness of syntactic systems against conversational language style is important to their success and requires future effort.
Pages: 444 / +
Page count: 2