Improving Input-label Mapping with Demonstration Replay for In-context Learning

Cited by: 0
Authors
Gong, Zhuocheng [1 ,3 ]
Liu, Jiahao [2 ]
Wang, Qifan [3 ]
Wang, Jingang [2 ]
Cai, Xunliang [2 ]
Zhao, Dongyan [1 ,4 ,5 ]
Yan, Rui [6 ]
Affiliations
[1] Peking Univ, Wangxuan Inst Comp Technol, Beijing, Peoples R China
[2] Meituan, Beijing, Peoples R China
[3] Meta AI, Menlo Pk, CA USA
[4] Natl Key Lab Gen Artificial Intelligence, Beijing, Peoples R China
[5] Beijing Inst Gen Artificial Intelligence, Beijing, Peoples R China
[6] Renmin Univ China, Gaoling Sch Artificial Intelligence, Beijing, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In-context learning (ICL) is an emerging capability of large autoregressive language models in which a few input-label demonstrations are appended to the input to enhance the model's understanding of downstream NLP tasks, without directly adjusting the model parameters. The effectiveness of ICL can be attributed to the strong language modeling capabilities of large language models (LLMs), which enable them to learn the mapping between inputs and labels from in-context demonstrations. Despite achieving promising results, the causal nature of language modeling in ICL restricts attention to be backward only, i.e., a token attends only to its preceding tokens, failing to capture the full input-label information and limiting the model's performance. In this paper, we propose a novel ICL method called Repeated Demonstration with Sliding Causal Attention (RDSCA). Specifically, we duplicate later demonstrations and concatenate them to the front, allowing the model to 'observe' the later information even under the causal restriction. In addition, we introduce sliding causal attention, which customizes the causal attention mask to avoid information leakage. Experimental results show that our method significantly improves the input-label mapping in ICL demonstrations. We also conduct an in-depth analysis of how to customize causal attention without training, which has been an unexplored area in previous research.
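To make the replay-plus-masking idea concrete, the following is a minimal sketch of a segment-level attention mask for the layout the abstract describes: replayed copies of later demonstrations placed in front, followed by the original demonstrations and the query. This is a hypothetical simplification (masking whole demonstration segments rather than individual tokens, and only blocking a demonstration from attending to its own replayed copy), not the paper's exact per-token implementation.

```python
import numpy as np

def rdsca_segment_mask(k: int) -> np.ndarray:
    """Toy segment-level mask for a demonstration-replay layout.

    Segment order (left to right): replayed copies of demos 2..k,
    then original demos 1..k, then the query segment.
    mask[i, j] == True means segment i may attend to segment j.
    """
    n = (k - 1) + k + 1                          # replays + originals + query
    mask = np.tril(np.ones((n, n), dtype=bool))  # start from plain causal mask

    # Each original demo d >= 2 must not attend to the replayed copy of
    # itself, which would leak its own label through the duplicate.
    for d in range(2, k + 1):
        original_idx = (k - 1) + (d - 1)
        replay_idx = d - 2
        mask[original_idx, replay_idx] = False

    # The query (last segment) reads only the original demos, not replays.
    mask[n - 1, : k - 1] = False
    return mask
```

With k = 3 demonstrations, original demo 1 (which plain causal attention would leave blind to demos 2 and 3) now attends to both replayed copies in front of it, while demos 2 and 3 are blocked from seeing their own duplicates and the query sees each demonstration exactly once.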
Pages: 14923-14934
Page count: 12
Related papers
20 records
  • [1] Unified Demonstration Retriever for In-Context Learning
    Li, Xiaonan
    Lv, Kai
    Yan, Hang
    Lin, Tianyang
    Wei, Zhu
    Ni, Yuan
    Xie, Guotong
    Wang, Xiaoling
    Qiu, Xipeng
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 4644 - 4668
  • [2] Mitigating Label Biases for In-context Learning
    Fei, Yu
    Hou, Yifan
    Chen, Zeming
    Bosselut, Antoine
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 14014 - 14031
  • [3] DCLMD: dynamic clustering and label mapping distribution for constructing in-context learning demonstrations
    Du, Yongping
    Zhang, Qi
    Fu, Shuyi
    Hou, Ying
    Han, Honggui
    The Journal of Supercomputing, 81 (5)
  • [4] Not All Demonstration Examples are Equally Beneficial: Reweighting Demonstration Examples for In-Context Learning
    Yang, Zhe
    Dai, Damai
    Wang, Peiyi
    Sui, Zhifang
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 13209 - 13221
  • [5] Dr.ICL: Demonstration-Retrieved In-context Learning
    Luo, Man
    Xu, Xin
    Dai, Zhuyun
    Pasupat, Panupong
    Kazemi, Mehran
    Baral, Chitta
    Imbrasaite, Vaiva
    Zhao, Vincent Y.
    Data Intelligence, 2024, 6 (04) : 909 - 922
  • [6] Exploring Effective Factors for Improving Visual In-Context Learning
    Sun, Yanpeng
    Chen, Qiang
    Wang, Jian
    Wang, Jingdong
    Li, Zechao
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2025, 34 : 2147 - 2160
  • [7] Rethinking and Improving Visual Prompt Selection for In-Context Learning Segmentation
    Suo, Wei
    Lai, Lanqing
    Sun, Mengyang
    Zhang, Hanwang
    Wang, Peng
    Zhang, Yanning
    COMPUTER VISION-ECCV 2024, PT XLVI, 2025, 15104 : 18 - 35
  • [8] Improving LLM-Based Health Information Extraction with In-Context Learning
    Liu, Junkai
    Wang, Jiayi
    Huang, Hui
    Zhang, Rui
    Yang, Muyun
    Zhao, Tiejun
    HEALTH INFORMATION PROCESSING: EVALUATION TRACK PAPERS, CHIP 2023, 2024, 2080 : 49 - 59
  • [9] Representative Demonstration Selection for In-Context Learning with Two-Stage Determinantal Point Process
    Yang, Zhao
    Zhang, Yuanzhe
    Sui, Dianbo
    Liu, Cao
    Zhao, Jun
    Liu, Kang
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 5443 - 5456
  • [10] Query-focused Submodular Demonstration Selection for In-context Learning in Large Language Models
    Trust, Paul
    Minghim, Rosane
    2023 31ST IRISH CONFERENCE ON ARTIFICIAL INTELLIGENCE AND COGNITIVE SCIENCE, AICS, 2023,