TeaForN: Teacher-Forcing with N-grams

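Authors
Goodman, Sebastian [1]
Ding, Nan [1]
Soricut, Radu [1]
Affiliations
[1] Google Res, Venice, CA 90291 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract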
Sequence generation models trained with teacher-forcing suffer from issues related to exposure bias and lack of differentiability across timesteps. Our proposed method, Teacher-Forcing with N-grams (TeaForN), addresses both these problems directly, through the use of a stack of N decoders trained to decode along a secondary time axis that allows model-parameter updates based on N prediction steps. TeaForN can be used with a wide class of decoder architectures and requires minimal modifications from a standard teacher-forcing setup. Empirically, we show that TeaForN boosts generation quality on one Machine Translation benchmark, WMT 2014 English-French, and two News Summarization benchmarks, CNN/Dailymail and Gigaword.
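The abstract's idea of updating parameters from N prediction steps can be sketched as a toy loss. This is a minimal illustration, not the authors' implementation: the stateless linear "decoder", the choice to feed each later step an expected embedding of the previous step's predicted distribution, and all names (`decode_step`, `expected_embedding`, `n_step_loss`, `step_weights`) are assumptions made here for illustration. The abstract does not specify how the stacked decoders are connected.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_step(embedding, weights):
    """Hypothetical toy decoder: logits[v] = dot(weights[v], embedding)."""
    return [sum(w * x for w, x in zip(row, embedding)) for row in weights]


def expected_embedding(probs, embeddings):
    """Probability-weighted average of the token embeddings (assumed input
    for the next decoder in the stack)."""
    dim = len(embeddings[0])
    return [sum(p * emb[d] for p, emb in zip(probs, embeddings))
            for d in range(dim)]


def n_step_loss(tokens, embeddings, weights, n_steps, step_weights=None):
    """Average weighted cross-entropy over up to n_steps predictions per position.

    Step 0 is ordinary teacher forcing: the input is the gold token at
    position t and the target is the token at t + 1. Step n > 0 reuses the
    same decoder weights, takes the previous step's expected embedding as
    input, and is scored against the gold token at t + 1 + n.
    """
    if step_weights is None:
        step_weights = [1.0] * n_steps
    total, count = 0.0, 0
    for t in range(len(tokens) - 1):
        inp = embeddings[tokens[t]]  # gold input for the first decoder
        for n in range(n_steps):
            target_pos = t + 1 + n
            if target_pos >= len(tokens):
                break  # no gold token left to score against
            probs = softmax(decode_step(inp, weights))
            total += step_weights[n] * -math.log(probs[tokens[target_pos]])
            count += 1
            inp = expected_embedding(probs, embeddings)  # feed prediction forward
    return total / count


# Toy usage: 3-token vocabulary, 2-dimensional embeddings.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
loss = n_step_loss([0, 1, 2, 1], embeddings, weights, n_steps=3)
```

With `n_steps=1` the function reduces to a standard teacher-forcing cross-entropy. In a framework with automatic differentiation, the expected-embedding input would let gradients from later steps flow back through earlier predictions, which is one plausible reading of the abstract's remark about differentiability across timesteps.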
Pages: 8704-8717
Page count: 14