The CRINGE Loss: Learning what language not to model

Cited: 0
Authors
Adolphs, Leonard [1 ,2 ]
Gao, Tianyu [1 ,3 ]
Xu, Jing [1 ]
Shuster, Kurt [1 ]
Sukhbaatar, Sainbayar [1 ]
Weston, Jason [1 ]
Affiliations
[1] Meta AI, Menlo Pk, CA 94025 USA
[2] Swiss Fed Inst Technol, Zurich, Switzerland
[3] Princeton Univ, Princeton, NJ USA
Keywords
DOI
Not available
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Standard language model training employs gold human documents or human-human interaction data, and treats all training data as positive examples. Growing evidence shows that even with very large amounts of positive training data, issues remain that can be alleviated with relatively small amounts of negative data - examples of what the model should not do. In this work, we propose a novel procedure to train with such data called the CRINGE loss (ContRastive Iterative Negative GEneration). We show the effectiveness of this approach across three different experiments on the tasks of safe generation, contradiction avoidance, and open-domain dialogue. Our models outperform multiple strong baselines and are conceptually simple, easy to train and implement.
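The abstract does not spell out the loss itself, but it describes a contrastive penalty applied to negative examples alongside standard training on positive data. Below is a minimal, hypothetical PyTorch-style sketch of one plausible token-level variant of such a penalty; the function name, the top-k positive sampling, and the tensor shapes are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def cringe_style_loss(logits, negative_tokens, k=5):
    """Contrastive penalty on a sequence the model should NOT generate (sketch).

    logits:          (seq_len, vocab_size) model scores at each position
    negative_tokens: (seq_len,) token ids of the undesirable continuation
    """
    # Score the model assigns to each negative token.
    neg_scores = logits.gather(1, negative_tokens.unsqueeze(1)).squeeze(1)   # (seq_len,)

    # Sample a "positive" alternative from the model's own top-k predictions
    # at each position (in practice the negative token would be masked out).
    topk_scores, _ = logits.topk(k, dim=-1)                                   # (seq_len, k)
    sampled = torch.multinomial(F.softmax(topk_scores, dim=-1), 1)            # (seq_len, 1)
    pos_scores = topk_scores.gather(1, sampled).squeeze(1)                    # (seq_len,)

    # Two-way contrast: push the sampled positive logit above the negative one.
    pair = torch.stack([pos_scores, neg_scores], dim=-1)                      # (seq_len, 2)
    return F.cross_entropy(pair, torch.zeros_like(negative_tokens))

# Hypothetical usage: combine with the usual cross-entropy loss on positive data,
# e.g. loss = mle_loss + cringe_style_loss(model_logits, bad_reply_ids)
```

In training, a term like this would be added to the standard likelihood loss on positive sequences; the "iterative" part of the abstract suggests repeatedly generating from the model, labeling the generations as positive or negative, and retraining.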
Pages: 8854-8874
Page count: 21