ZARA: Improving Few-Shot Self-Rationalization for Small Language Models

Cited: 0
Authors
Chen, Wei-Lin [1 ]
Yen, An-Zi [2 ]
Wu, Cheng-Kuang [1 ]
Huang, Hen-Hsen [3 ]
Chen, Hsin-Hsi [1 ]
Affiliations
[1] Natl Taiwan Univ, Taipei, Taiwan
[2] Natl Yang Ming Chiao Tung Univ, Taipei, Taiwan
[3] Acad Sinica, Taipei, Taiwan
Keywords
ERROR;
DOI
Not available
CLC Classification Number
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Language models (LMs) that jointly generate end-task answers and free-text rationales are known as self-rationalization models. Recent works demonstrate substantial performance gains for self-rationalization by few-shot prompting LMs with rationale-augmented exemplars. However, the ability to benefit from explanations emerges only in large-scale LMs, which have poor accessibility. In this work, we explore the less-studied setting of leveraging explanations in small LMs to improve few-shot self-rationalization. We first revisit the relationship between rationales and answers. Inspired by the implicit mental process by which humans assess explanations, we present a novel approach, Zero-shot Augmentation of Rationale-Answer pairs (ZARA), which automatically constructs pseudo-parallel data for self-training by reducing the problem of plausibility judgement to natural language inference. Experimental results show ZARA achieves SOTA performance on the FEB benchmark, for both task accuracy and the explanation metric. In addition, we conduct human and quantitative evaluations validating ZARA's ability to automatically identify plausible and accurate rationale-answer pairs.
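The abstract's core idea — reducing rationale plausibility judgement to natural language inference, then keeping only the rationale-answer pairs judged plausible as pseudo-parallel self-training data — can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' implementation: the filtering function, threshold, and the toy word-overlap NLI scorer are all stand-in assumptions (in practice an off-the-shelf NLI model would score entailment).

```python
def plausibility_filter(pairs, nli_entails, threshold=0.5):
    """Keep rationale-answer pairs whose rationale entails the answer.

    pairs: list of (rationale, answer) string tuples; the rationale acts
           as the NLI premise and the answer as the hypothesis.
    nli_entails: callable returning an entailment score in [0, 1]
                 (stubbed below; a real NLI model would go here).
    """
    return [(r, a) for r, a in pairs if nli_entails(r, a) >= threshold]


def toy_nli(premise, hypothesis):
    # Toy stand-in for an NLI model: scores entailment by the fraction
    # of hypothesis words that also appear in the premise.
    p = set(premise.lower().split())
    h = set(hypothesis.lower().split())
    return len(h & p) / max(len(h), 1)


pairs = [
    ("birds can fly because they have wings", "birds fly"),
    ("the sky is blue", "grass is green"),
]
kept = plausibility_filter(pairs, toy_nli)
# Only the first pair survives: its rationale fully supports the answer.
```

The surviving pairs would then serve as pseudo-labeled examples for self-training the small LM.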
Pages: 4682-4693
Page count: 12