Rule or Story, Which is a Better Commonsense Expression for Talking with Large Language Models?

Citations: 0

Authors:

Bian, Ning [1,2]

Han, Xianpei [1,2]

Lin, Hongyu [2]

Lu, Yaojie [2]

He, Ben [1,2]

Sun, Le [1,2]

Affiliations:

[1] Univ Chinese Acad Sci, Beijing, Peoples R China

[2] Chinese Acad Sci, Inst Software, Beijing, Peoples R China

Source:

PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS | 2024

Funding:

National Natural Science Foundation of China;

Keywords:

GRAPH; DATASET; MEMORY; ATLAS;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Building machines with commonsense has been a longstanding challenge in NLP due to the reporting bias of commonsense rules and the exposure bias of rule-based commonsense reasoning. In contrast, humans convey and pass down commonsense implicitly through stories. This paper investigates the inherent commonsense ability of large language models (LLMs) expressed through storytelling. We systematically investigate and compare stories and rules for retrieving and leveraging commonsense in LLMs. Experimental results on 28 commonsense QA datasets show that stories outperform rules as the expression for retrieving commonsense from LLMs, exhibiting higher generation confidence and commonsense accuracy. Moreover, stories are the more effective commonsense expression for answering questions regarding daily events, while rules are more effective for scientific questions. This aligns with the reporting bias of commonsense in text corpora. We also show that the correctness and relevance of commonsense stories can be further improved via iterative self-supervised fine-tuning. These findings emphasize the importance of using appropriate language to express, retrieve, and leverage commonsense for LLMs, highlighting a promising direction for better exploiting their commonsense abilities.


Pages: 4023-4043

Page count: 21

