Automated Assessment of Fidelity and Interpretability: An Evaluation Framework for Large Language Models' Explanations (Student Abstract)

Cited by: 0
Authors
Kuo, Mu-Tien [1 ,2 ]
Hsueh, Chih-Chung [1 ,2 ]
Tsai, Richard Tzong-Han [2 ,3 ]
Affiliations
[1] Chingshin Acad, Taipei, Taiwan
[2] Acad Sinica, Res Ctr Humanities & Social Sci, Taipei, Taiwan
[3] Natl Cent Univ, Dept Comp Sci & Engn, Taoyuan, Taiwan
Source
THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 21, 2024
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
As Large Language Models (LLMs) become more prevalent in various fields, it is crucial to rigorously assess the quality of their explanations. Our research introduces a task-agnostic framework for evaluating free-text rationales, drawing on insights from both linguistics and machine learning. We evaluate two dimensions of explainability: fidelity and interpretability. For fidelity, we propose methods suitable for proprietary LLMs where direct introspection of internal features is unattainable. For interpretability, we use language models instead of human evaluators, addressing concerns about subjectivity and scalability in evaluations. We apply our framework to evaluate GPT-3.5 and the impact of prompts on the quality of its explanations. In conclusion, our framework streamlines the evaluation of explanations from LLMs, promoting the development of safer models.
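The abstract describes two black-box measurements: fidelity assessed without access to a proprietary model's internal features, and interpretability rated by a language model in place of human evaluators. The abstract does not spell out its metrics, so the following Python sketch only illustrates that general setup under stated assumptions: the answer-consistency fidelity proxy, the 1-5 judge rubric, and the LLM callable are illustrative stand-ins, not the authors' published method.

```python
"""Minimal sketch of black-box rationale evaluation (illustrative only).

Assumptions: fidelity is approximated by whether the model can recover its
own answer from the rationale alone, and interpretability by an LLM-as-judge
clarity rating. `LLM` is a hypothetical hook for any chat-completion client.
"""
from typing import Callable

LLM = Callable[[str], str]  # prompt in, completion out


def fidelity_score(llm: LLM, answer: str, rationale: str) -> float:
    """Black-box fidelity proxy: with the original question withheld, can the
    model recover the answer from the rationale alone?"""
    probe = (
        "Based only on the following explanation, state the final answer.\n"
        f"Explanation: {rationale}\nFinal answer:"
    )
    recovered = llm(probe).strip().lower()
    return 1.0 if answer.strip().lower() in recovered else 0.0


def interpretability_score(llm: LLM, rationale: str) -> float:
    """LLM-as-judge in place of human raters: ask for a 1-5 clarity rating
    and normalize it to [0, 1]."""
    probe = (
        "Rate how clear and easy to follow this explanation is on a scale "
        "of 1 (opaque) to 5 (very clear). Reply with one digit.\n"
        f"Explanation: {rationale}\nRating:"
    )
    reply = llm(probe)
    digits = [c for c in reply if c in "12345"]
    return (int(digits[0]) - 1) / 4 if digits else 0.0


if __name__ == "__main__":
    # Toy stand-in so the sketch runs without any API access.
    dummy = lambda p: "4" if "Rate" in p else "paris"
    print(fidelity_score(dummy, "Paris", "France's capital is Paris."))       # 1.0
    print(interpretability_score(dummy, "France's capital is Paris."))        # 0.75
```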
Pages: 23554-23555
Page count: 2