TeaForN: Teacher-Forcing with N-grams

Times cited: 0
Authors
Goodman, Sebastian [1]
Ding, Nan [1]
Soricut, Radu [1]
Affiliations
[1] Google Res, Venice, CA 90291 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Sequence generation models trained with teacher-forcing suffer from issues related to exposure bias and lack of differentiability across timesteps. Our proposed method, Teacher-Forcing with N-grams (TeaForN), addresses both these problems directly, through the use of a stack of N decoders trained to decode along a secondary time axis that allows model-parameter updates based on N prediction steps. TeaForN can be used with a wide class of decoder architectures and requires minimal modifications from a standard teacher-forcing setup. Empirically, we show that TeaForN boosts generation quality on one Machine Translation benchmark, WMT 2014 English-French, and two News Summarization benchmarks, CNN/Dailymail and Gigaword.
Pages: 8704-8717
Number of pages: 14