Adaptation with Self-Evaluation to Improve Selective Prediction in LLMs

Authors
Chen, Jiefeng [1, 3, 4]
Yoon, Jinsung [2]
Ebrahimi, Sayna [2]
Arik, Sercan O. [2]
Pfister, Tomas [2]
Jha, Somesh [1, 2]
Affiliations
[1] Univ Wisconsin Madison, Madison, WI 53706 USA
[2] Google LLC, Mountain View, CA USA
[3] Google, Mountain View, CA USA
[4] Amazon, Seattle, WA USA
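Funding
National Science Foundation (USA)
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract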
Large language models (LLMs) have recently shown great advances in a variety of tasks, including natural language understanding and generation. However, their use in high-stakes decision-making scenarios is still limited due to the potential for errors. Selective prediction is a technique that can be used to improve the reliability of the LLMs by allowing them to abstain from making predictions when they are unsure of the answer. In this work, we propose a novel framework for adaptation with self-evaluation to improve the selective prediction performance of LLMs. Our framework is based on the idea of using parameter-efficient tuning to adapt the LLM to the specific task at hand while improving its ability to perform self-evaluation. We evaluate our method on a variety of question-answering (QA) datasets and show that it outperforms state-of-the-art selective prediction methods. For example, on the CoQA benchmark, our method improves the AUACC from 91.23% to 92.63% and improves the AUROC from 74.61% to 80.25%.
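The abstract reports AUACC and AUROC without defining them. As a rough illustration only, the sketch below shows one common way to score selective prediction: the model abstains when its confidence is below a threshold, AUACC is approximated as the mean accuracy over the top-k most confident answers, and AUROC is the probability that a correct answer receives a higher confidence than an incorrect one. These formulations, the function names, and the toy scores are assumptions for illustration and may differ from the definitions used in the paper.

```python
from typing import List, Tuple


def selective_accuracy(conf: List[float], correct: List[int],
                       threshold: float) -> Tuple[float, float]:
    """Return (coverage, accuracy on answered items) when the model
    abstains on every item whose confidence is below `threshold`."""
    answered = [c for s, c in zip(conf, correct) if s >= threshold]
    coverage = len(answered) / len(conf)
    accuracy = sum(answered) / len(answered) if answered else 0.0
    return coverage, accuracy


def auacc(conf: List[float], correct: List[int]) -> float:
    """Approximate area under the accuracy-coverage curve: the mean of
    the accuracies obtained by answering only the k most confident
    items, for k = 1..n (an assumed, simplified formulation)."""
    order = sorted(range(len(conf)), key=lambda i: conf[i], reverse=True)
    hits = 0
    total = 0.0
    for k, i in enumerate(order, start=1):
        hits += correct[i]
        total += hits / k
    return total / len(order)


def auroc(conf: List[float], correct: List[int]) -> float:
    """Probability that a correct answer gets a higher confidence than
    an incorrect one; ties count as half a win."""
    pos = [s for s, c in zip(conf, correct) if c]
    neg = [s for s, c in zip(conf, correct) if not c]
    if not pos or not neg:
        raise ValueError("need at least one correct and one incorrect item")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical confidence scores and correctness labels (1 = correct).
scores = [0.9, 0.8, 0.6, 0.3]
labels = [1, 1, 0, 1]
coverage, accuracy = selective_accuracy(scores, labels, threshold=0.5)
```

With these toy values, a threshold of 0.5 makes the model answer three of the four questions and abstain on the least confident one. A better self-evaluation score would rank the incorrect answer lowest, which raises both metrics.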
Pages: 5190-5213
Number of pages: 24