BoardgameQA: A Dataset for Natural Language Reasoning with Contradictory Information

Cited by: 0
Authors
Kazemi, Mehran [1 ]
Yuan, Quan [1 ]
Bhatia, Deepti [1 ]
Kim, Najoung [1 ]
Xu, Xin [1 ]
Imbrasaite, Vaiva [1 ]
Ramachandran, Deepak [1 ]
Affiliations
[1] Google Res, Mountain View, CA 94043 USA
DOI
None available
Chinese Library Classification
TP18 (Artificial intelligence theory);
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Automated reasoning with unstructured natural text is a key requirement for many potential applications of NLP and for developing robust AI systems. Recently, Language Models (LMs) have demonstrated complex reasoning capacities even without any finetuning. However, existing evaluations of automated reasoning assume access to a consistent and coherent set of information over which models reason. When reasoning in the real world, the available information is frequently inconsistent or contradictory, so models need a strategy to resolve such conflicts when they arise. One widely applicable way of resolving conflicts is to impose preferences over information sources (e.g., based on source credibility or information recency) and adopt the information from the source with higher preference. In this paper, we formulate the problem of reasoning with contradictory information guided by preferences over sources as the classical problem of defeasible reasoning, and develop a dataset called BoardgameQA for measuring the reasoning capacity of LMs in this setting. BoardgameQA also incorporates reasoning with implicit background knowledge, to better reflect reasoning problems in downstream applications. We benchmark various LMs on BoardgameQA, and the results reveal a significant gap in the reasoning capacity of state-of-the-art LMs on this problem, showing that reasoning with conflicting information does not surface out-of-the-box in LMs. While performance can be improved with finetuning, it nevertheless remains poor.
Pages: 23