Language Models are Few-Shot Learners

Cited by: 0
Authors
Brown, Tom B.
Mann, Benjamin
Ryder, Nick
Subbiah, Melanie
Kaplan, Jared [1, 2]
Dhariwal, Prafulla
Neelakantan, Arvind
Shyam, Pranav
Sastry, Girish
Askell, Amanda
Agarwal, Sandhini
Herbert-Voss, Ariel
Krueger, Gretchen
Henighan, Tom
Child, Rewon
Ramesh, Aditya
Ziegler, Daniel M.
Wu, Jeffrey
Winter, Clemens
Hesse, Christopher
Chen, Mark
Sigler, Eric
Litwin, Mateusz
Gray, Scott
Chess, Benjamin
Clark, Jack
Berner, Christopher
McCandlish, Sam
Radford, Alec
Sutskever, Ilya
Amodei, Dario
Affiliations
[1] Johns Hopkins Univ, Baltimore, MD 21218 USA
[2] OpenAI, San Francisco, CA 94110 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
We demonstrate that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even becoming competitive with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks. We also identify some datasets where GPT-3's few-shot learning still struggles, as well as some datasets where GPT-3 faces methodological issues related to training on large web corpora.
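The few-shot setting described in the abstract can be illustrated with a short sketch: task demonstrations and the query are concatenated into a single text prompt and given to an autoregressive language model, with no gradient updates or fine-tuning. The sketch below is an illustrative assumption, not the paper's exact evaluation protocol; since GPT-3 itself is not publicly downloadable, it uses the much smaller GPT-2 via the Hugging Face transformers library as a stand-in, with a translation prompt modeled on the demonstration format shown in the paper.

```python
# Minimal sketch of few-shot, in-context prompting: the task is specified
# purely via text, and the model's weights are never updated.
# GPT-2 is used here only as a publicly available stand-in for GPT-3.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Few-shot demonstrations followed by the query, all in one prompt.
prompt = (
    "Translate English to French.\n"
    "sea otter => loutre de mer\n"
    "peppermint => menthe poivrée\n"
    "cheese =>"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=5,                       # only a short completion is needed
    do_sample=False,                        # greedy decoding for determinism
    pad_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens (the model's answer to the query).
completion = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(completion)
```

With a 175-billion-parameter model the completion would typically be the correct translation; with the small GPT-2 stand-in the output is merely indicative of how the prompt-based interface works.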
Pages: 25
Related Papers
50 items in total
  • [31] Fairness-guided Few-shot Prompting for Large Language Models
    Ma, Huan
    Zhang, Changqing
    Bian, Yatao
    Liu, Lemao
    Zhang, Zhirui
    Zhao, Peilin
    Zhang, Shu
    Fu, Huazhu
    Hu, Qinghua
    Wu, Bingzhe
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [32] LLaFS: When Large Language Models Meet Few-Shot Segmentation
    Zhu, Lanyun
    Chen, Tianrun
    Ji, Deyi
    Ye, Jieping
    Liu, Jun
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2024, 2024, : 3065 - 3075
  • [33] Political Bias of Large Language Models in Few-Shot News Summarization
    Onishi, Takeshi
    Caverlee, James
    ADVANCES IN BIAS AND FAIRNESS IN INFORMATION RETRIEVAL, BIAS 2024, 2025, 2227 : 32 - 45
  • [34] Automated Few-shot Classification with Instruction-Finetuned Language Models
    Aly, Rami
    Shi, Xingjian
    Lin, Kaixiang
    Zhang, Aston
    Wilson, Andrew Gordon
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023, : 2414 - 2432
  • [35] Refactoring Programs Using Large Language Models with Few-Shot Examples
    Shirafuji, Atsushi
    Oda, Yusuke
    Suzuki, Jun
    Morishita, Makoto
    Watanobe, Yutaka
    PROCEEDINGS OF THE 2023 30TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE, APSEC 2023, 2023, : 151 - 160
  • [36] Black Box Few-Shot Adaptation for Vision-Language models
    Ouali, Yassine
    Bulat, Adrian
    Martinez, Brais
    Tzimiropoulos, Georgios
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 15488 - 15500
  • [37] Calibrate Before Use: Improving Few-Shot Performance of Language Models
    Zhao, Tony Z.
    Wallace, Eric
    Feng, Shi
    Klein, Dan
    Singh, Sameer
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [38] TabLLM: Few-shot Classification of Tabular Data with Large Language Models
    Hegselmann, Stefan
    Buendia, Alejandro
    Lang, Hunter
    Agrawal, Monica
    Jiang, Xiaoyi
    Sontag, David
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 206, 2023, 206
  • [39] Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm
    Reynolds, Laria
    McDonell, Kyle
    EXTENDED ABSTRACTS OF THE 2021 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS (CHI'21), 2021,
  • [40] Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners
    Zhang, Renrui
    Hu, Xiangfei
    Li, Bohao
    Huang, Siyuan
    Deng, Hanqiu
    Qiao, Yu
    Gao, Peng
    Li, Hongsheng
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 15211 - 15222