TeaForN: Teacher-Forcing with N-grams

Cited by: 0
Authors
Goodman, Sebastian [1 ]
Ding, Nan [1 ]
Soricut, Radu [1 ]
Institution
[1] Google Res, Venice, CA 90291 USA
Keywords
DOI
N/A
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Sequence generation models trained with teacher-forcing suffer from issues related to exposure bias and lack of differentiability across timesteps. Our proposed method, Teacher-Forcing with N-grams (TeaForN), addresses both these problems directly, through the use of a stack of N decoders trained to decode along a secondary time axis that allows model-parameter updates based on N prediction steps. TeaForN can be used with a wide class of decoder architectures and requires minimal modifications from a standard teacher-forcing setup. Empirically, we show that TeaForN boosts generation quality on one Machine Translation benchmark, WMT 2014 English-French, and two News Summarization benchmarks, CNN/Dailymail and Gigaword.
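The abstract describes training with losses accumulated over N prediction steps, where the model's own outputs are fed back in along a secondary time axis. The following is a minimal toy sketch of that idea in numpy, not the paper's implementation: the "decoder" is a single linear map, weight sharing and soft-embedding feedback are assumptions made for brevity, and gradients are omitted.

```python
# Toy sketch of the TeaForN-style N-step loss (assumptions: a linear toy
# decoder, shared weights across the N rollout steps, and soft-embedding
# feedback of predictions; this is NOT the authors' code).
import numpy as np

rng = np.random.default_rng(0)
VOCAB, EMB, N = 5, 4, 3            # vocab size, embedding dim, rollout depth

E = rng.normal(size=(VOCAB, EMB))  # token embedding table
W = rng.normal(size=(EMB, VOCAB))  # toy decoder: embedding -> next-token logits

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)   # stabilize before exponentiating
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def teaforn_loss(tokens):
    """Mean cross-entropy over N rollout steps (the secondary time axis).

    Standard teacher forcing is the N = 1 special case: predict token t+1
    from the gold token t. For n > 1 the decoder consumes its own previous
    prediction (here, a probability-weighted soft embedding) instead of gold.
    """
    total, count = 0.0, 0
    for t in range(len(tokens) - 1):
        emb = E[tokens[t]]                   # step 1 starts from the gold token
        for n in range(1, N + 1):
            if t + n >= len(tokens):
                break
            probs = softmax(emb @ W)         # n-th decoder in the stack
            total += -np.log(probs[tokens[t + n]] + 1e-12)
            count += 1
            emb = probs @ E                  # feed the prediction back in
    return total / count

loss = teaforn_loss([0, 1, 2, 3, 4])
print(loss)
```

With N = 1 the inner loop reduces to ordinary teacher forcing; larger N exposes the model to its own predictions during training, which is the exposure-bias remedy the abstract claims.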
Pages: 8704 - 8717
Page count: 14
Related Papers
50 items in total
  • [1] The Distribution of N-Grams
    Leo Egghe
    Scientometrics, 2000, 47 : 237 - 252
  • [2] Collocations and N-grams
    Freebury-Jones, Darren
    RENAISSANCE AND REFORMATION, 2021, 44 (04) : 210 - 216
  • [4] IDF for Word N-grams
    Shirakawa, Masumi
    Hara, Takahiro
    Nishio, Shojiro
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2017, 36 (01)
  • [5] Syntactic n-grams in Computational Linguistics
    Shi, Feng
    Feng, Guohua
    NATURAL LANGUAGE ENGINEERING, 2023, 29 (05) : 1411 - 1413
  • [6] The Role of n-grams in Firstborns Identification
    Ramirez-de-la-Rosa, Gabriela
    Reyes-Meza, Veronica
    Villatoro-Tello, Esau
    Jimenez-Salazar, Hector
    Montes-y-Gomez, Manuel
    Villasenor-Pineda, Luis
    ADVANCES IN ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, MICAI 2015, PT I, 2015, 9413 : 95 - 106
  • [7] Which Granularity to Bootstrap a Multilingual Method of Document Alignment: Character N-grams or Word N-grams?
    Lecluze, Charlotte
    Rigouste, Lois
    Giguet, Emmanuel
    Lucas, Nadine
    CORPUS RESOURCES FOR DESCRIPTIVE AND APPLIED STUDIES. CURRENT CHALLENGES AND FUTURE DIRECTIONS: SELECTED PAPERS FROM THE 5TH INTERNATIONAL CONFERENCE ON CORPUS LINGUISTICS (CILC2013), 2013, 95 : 473 - 481
  • [8] SPEECH RECOGNITION USING FUNCTION-WORD N-GRAMS AND CONTENT-WORD N-GRAMS
    ISOTANI, R
    MATSUNAGA, S
    SAGAYAMA, S
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 1995, E78D (06) : 692 - 697
  • [9] The subjective frequency of word n-grams
    Shaoul, Cyrus
    Westbury, Chris F.
    Baayen, R. Harald
    PSIHOLOGIJA, 2013, 46 (04) : 497 - 537
  • [10] Implicit N-grams Induced by Recurrence
    Sun, Xiaobing
    Lu, Wei
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 1624 - 1639