Active Learning Principles for In-Context Learning with Large Language Models

Cited by: 0
Authors
Margatina, Katerina [1 ,2 ]
Schick, Timo [2 ]
Aletras, Nikolaos [1 ]
Dwivedi-Yu, Jane [2 ]
Affiliations
[1] University of Sheffield, Sheffield, South Yorkshire, England
[2] Meta, FAIR, Menlo Park, CA, USA
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The remarkable advancements in large language models (LLMs) have significantly enhanced predictive performance in few-shot learning settings. By using only a small number of labeled examples, referred to as demonstrations, LLMs can effectively perform the task at hand through in-context learning. However, the process of selecting demonstrations for maximizing performance has received limited attention in prior work. This paper addresses the issue of identifying the most informative demonstrations for few-shot learning by approaching it as a pool-based Active Learning (AL) problem over a single iteration. We compare standard AL algorithms based on uncertainty, diversity, and similarity, and consistently observe that similarity-based selection outperforms all other methods, including random sampling. Our extensive experimentation involving a diverse range of GPT and OPT models across 24 classification and multi-choice tasks, coupled with thorough analysis, unambiguously demonstrates the importance of using demonstrations that are semantically similar to the domain of the test examples. In fact, we show higher average classification performance using "similar" demonstrations with GPT-2 (124M) than random demonstrations with GPT-NeoX (20B). Notably, while diversity sampling shows promise, uncertainty sampling, despite its success in conventional supervised learning AL scenarios, performs poorly in in-context learning.
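The abstract's central finding, that retrieving demonstrations semantically similar to the test input beats uncertainty-, diversity-, and random-based selection, maps onto a simple retrieval step at inference time. Below is a minimal sketch of similarity-based demonstration selection, assuming SBERT-style sentence embeddings from the sentence-transformers library; the encoder name, the pool/example format, and k are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch of similarity-based demonstration selection for in-context
# learning. Assumptions (not from the paper): sentence-transformers encoder
# "all-MiniLM-L6-v2", a pool of {"text", "label"} dicts, and k = 4 shots.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

def select_demonstrations(pool, test_input, k=4):
    """Return the k labeled pool examples most similar to the test input."""
    pool_emb = encoder.encode([ex["text"] for ex in pool],
                              normalize_embeddings=True)
    test_emb = encoder.encode([test_input], normalize_embeddings=True)[0]
    # With unit-norm embeddings, the dot product equals cosine similarity.
    scores = pool_emb @ test_emb
    top_k = np.argsort(-scores)[:k]
    return [pool[i] for i in top_k]

def build_prompt(demos, test_input):
    """Concatenate the selected demonstrations and the unlabeled test input."""
    blocks = [f"Input: {d['text']}\nLabel: {d['label']}" for d in demos]
    blocks.append(f"Input: {test_input}\nLabel:")
    return "\n\n".join(blocks)
```

Consistent with the paper's single-iteration AL framing, there is no iterative re-labeling loop here: the pool is embedded once and selection reduces to a nearest-neighbor lookup per test example.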
Pages: 5011-5034
Page count: 24
Related papers
(50 in total)
  • [1] Learning to Retrieve In-Context Examples for Large Language Models
    Wang, Liang
    Yang, Nan
    Wei, Furu
    PROCEEDINGS OF THE 18TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 1752 - 1767
  • [2] Adaptive In-Context Learning with Large Language Models for Bundle Generation
    Sun, Zhu
    Feng, Kaidong
    Yang, Jie
    Qu, Xinghua
    Fang, Hui
    Ong, Yew-Soon
    Liu, Wenyuan
    PROCEEDINGS OF THE 47TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2024, 2024, : 966 - 976
  • [3] Visual In-Context Learning for Large Vision-Language Models
    Zhou, Yucheng
    Li, Xiang
    Wang, Qianning
    Shen, Jianbing
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 15890 - 15902
  • [4] Are Emergent Abilities in Large Language Models just In-Context Learning?
    Lu, Sheng
    Bigoulaeva, Irina
    Sachdeva, Rachneet
    Madabushi, Harish Tayyar
    Gurevych, Iryna
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 5098 - 5139
  • [5] Steering Large Language Models for Machine Translation with Finetuning and In-Context Learning
    Alves, Duarte M.
    Guerreiro, Nuno M.
    Alves, Joao
    Pombal, Jose
    Rei, Ricardo
    de Souza, Jose G. C.
    Colombo, Pierre
    Martins, Andre F. T.
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 11127 - 11148
  • [6] Symbol tuning improves in-context learning in language models
    Wei, Jerry
    Hou, Le
    Lampinen, Andrew
    Chen, Xiangning
    Huang, Da
    Tay, Yi
    Chen, Xinyun
    Lu, Yifeng
    Zhou, Denny
    Ma, Tengyu
    Le, Quoc V.
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 968 - 979
  • [7] Meta In-Context Learning: Harnessing Large Language Models for Electrical Data Classification
    Zhou, Mi
    Li, Fusheng
    Zhang, Fan
    Zheng, Junhao
    Ma, Qianli
    ENERGIES, 2023, 16 (18)
  • [8] Large Language Models Can be Lazy Learners: Analyze Shortcuts in In-Context Learning
    Tang, Ruixiang
    Kong, Dehan
    Huang, Longtao
    Xue, Hui
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 4645 - 4657
  • [9] Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning
    Wang, Xinyi
    Zhu, Wanrong
    Saxon, Michael
    Steyvers, Mark
    Wang, William Yang
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [10] Automatic smart contract comment generation via large language models and in-context learning
    Zhao, Junjie
    Chen, Xiang
    Yang, Guang
    Shen, Yiheng
    INFORMATION AND SOFTWARE TECHNOLOGY, 2024, 168