BoardgameQA: A Dataset for Natural Language Reasoning with Contradictory Information

Cited: 0
Authors
Kazemi, Mehran [1 ]
Yuan, Quan [1 ]
Bhatia, Deepti [1 ]
Kim, Najoung [1 ]
Xu, Xin [1 ]
Imbrasaite, Vaiva [1 ]
Ramachandran, Deepak [1 ]
Affiliations
[1] Google Res, Mountain View, CA 94043 USA
DOI: none
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Automated reasoning with unstructured natural text is a key requirement for many potential applications of NLP and for developing robust AI systems. Recently, Language Models (LMs) have demonstrated complex reasoning capacities even without any finetuning. However, existing evaluations of automated reasoning assume access to a consistent and coherent set of information over which models reason. When reasoning in the real world, the available information is frequently inconsistent or contradictory, and therefore models need to be equipped with a strategy to resolve such conflicts when they arise. One widely applicable way of resolving conflicts is to impose preferences over information sources (e.g., based on source credibility or information recency) and adopt the conclusion from the source with higher preference. In this paper, we formulate the problem of reasoning with contradictory information guided by preferences over sources as the classical problem of defeasible reasoning, and develop a dataset called BoardgameQA for measuring the reasoning capacity of LMs in this setting. BoardgameQA also incorporates reasoning with implicit background knowledge, to better reflect reasoning problems in downstream applications. We benchmark various LMs on BoardgameQA and the results reveal a significant gap in the reasoning capacity of state-of-the-art LMs on this problem, showing that reasoning with conflicting information does not surface out-of-the-box in LMs. While performance can be improved with finetuning, it nevertheless remains poor.
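The conflict-resolution strategy described in the abstract (adopt the conclusion backed by the higher-preference source) can be illustrated with a minimal sketch. This is a hypothetical toy implementation for intuition only, not the BoardgameQA dataset's actual construction or evaluation code; the `resolve` function and the sample claims are illustrative assumptions.

```python
# Toy sketch of preference-guided defeasible reasoning: when two sources
# assert contradictory conclusions, the one with the higher preference
# (e.g., greater credibility or recency) defeats the other.
# Hypothetical example -- not taken from the BoardgameQA codebase.

def resolve(claims):
    """claims: list of (conclusion, preference) pairs, where a larger
    preference value marks a more trusted source. Returns the conclusion
    backed by the most-preferred source."""
    return max(claims, key=lambda claim: claim[1])[0]

# Two sources contradict each other about the same fact; the source with
# preference 2 defeats the source with preference 1.
claims = [
    ("the cat attacks the mouse", 1),       # less trusted source
    ("the cat does not attack the mouse", 2),  # more trusted source
]
print(resolve(claims))  # → "the cat does not attack the mouse"
```

In the dataset's setting the preferences are stated in natural language rather than as numeric weights, so the model must both detect the contradiction and infer which rule defeats which — which is what makes the task hard for LMs out of the box.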
Pages: 23
Related papers
50 records in total
  • [21] Knowledge representation and reasoning in (controlled) natural language
    Fuchs, NE
    CONCEPTUAL STRUCTURES: COMMON SEMANTICS FOR SHARING KNOWLEDGE, PROCEEDINGS, 2005, 3596 : 51 - 51
  • [22] Natural Language Metaphors Covertly Influence Reasoning
    Thibodeau, Paul H.
    Boroditsky, Lera
    PLOS ONE, 2013, 8 (01):
  • [23] Formal Reasoning on Natural Language Descriptions of Processes
    Sanchez-Ferreres, Josep
    Burattin, Andrea
    Carmona, Josep
    Montali, Marco
    Padro, Lluis
    BUSINESS PROCESS MANAGEMENT (BPM 2019), 2019, 11675 : 86 - 101
  • [24] INTERMEDIATE QUANTIFIERS, NATURAL LANGUAGE AND HUMAN REASONING
    Novak, V.
    Murinova, P.
    QUANTITATIVE LOGIC AND SOFT COMPUTING, 2012, 5 : 684 - 692
  • [25] Disentangling Reasoning Factors for Natural Language Inference
    Zhou, Xixi
    Zeng, Limin
    Zhao, Ziping
    Bu, Jiajun
    Liang, Wenjie
    Wang, Haishuai
    BIG DATA MINING AND ANALYTICS, 2025, 8 (03): : 694 - 711
  • [26] Representing and reasoning with events from natural language
    Leith, M
    Cunningham, J
    QUALITATIVE AND QUANTITATIVE PRACTICAL REASONING, 1997, 1244 : 406 - 420
  • [27] Making Natural Language Reasoning Explainable and Faithful
    Du, Xinya
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 20, 2024, : 22664 - 22664
  • [28] Discrete Reasoning Templates for Natural Language Understanding
    Al-Negheimish, Hadeel
    Madhyastha, Pranava
    Russo, Alessandra
    EACL 2021: THE 16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: PROCEEDINGS OF THE STUDENT RESEARCH WORKSHOP, 2021, : 80 - 87
  • [29] Reasoning about inconsistencies in natural language requirements
    Gervasi, V
    Zowghi, D
    ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 2005, 14 (03) : 277 - 330
  • [30] Analogical Reasoning for Natural to Formal Language Transfer
    Letard, Vincent
    Rosset, Sophie
    Illouz, Gabriel
    2015 IEEE 27TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2015), 2015, : 210 - 217