Evaluating BERT for natural language inference: A case study on the CommitmentBank

Cited by: 0
Authors
Jiang, Nanjiang [1]
de Marneffe, Marie-Catherine [1]
Affiliations
[1] Ohio State Univ, Dept Linguist, Columbus, OH 43210 USA
Funding
National Science Foundation (USA)
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Natural language inference (NLI) datasets (e.g., MultiNLI) were collected by soliciting hypotheses for a given premise from annotators. Such data collection led to annotation artifacts: systems can identify the premise-hypothesis relationship without observing the premise (e.g., negation in hypothesis being indicative of contradiction). We address this problem by recasting the CommitmentBank for NLI, which contains items involving reasoning over the extent to which a speaker is committed to complements of clause-embedding verbs under entailment-canceling environments (conditional, negation, modal and question). Instead of being constructed to stand in certain relationships with the premise, hypotheses in the recast CommitmentBank are the complements of the clause-embedding verb in each premise, leading to no annotation artifacts in the hypothesis. A state-of-the-art BERT-based model performs well on the CommitmentBank with 85% F1. However, analysis of model behavior shows that the BERT models still do not capture the full complexity of pragmatic reasoning, nor encode some of the linguistic generalizations, highlighting room for improvement.
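The annotation-artifact problem described in the abstract can be illustrated with a minimal sketch of a hypothesis-only baseline. This is not the authors' method or data: every sentence below is invented for illustration, and the names `NEGATION_CUES`, `hypothesis_only_predict`, `TOY_ELICITED`, and `TOY_RECAST` are hypothetical. The sketch assumes a crude rule (negation word in the hypothesis means contradiction) and shows why such a rule may succeed on elicited hypotheses but cannot succeed when the same embedded complement appears with different labels.

```python
# Toy illustration only: all examples are invented for this sketch and are
# NOT drawn from MultiNLI or the CommitmentBank.
NEGATION_CUES = {"no", "not", "never", "nobody", "nothing"}


def hypothesis_only_predict(hypothesis: str) -> str:
    """Guess a label from the hypothesis alone; the premise is never read."""
    tokens = hypothesis.lower().replace("n't", " not").split()
    tokens = {t.strip(".,!?") for t in tokens}
    return "contradiction" if NEGATION_CUES & tokens else "entailment"


def accuracy(examples) -> float:
    """Fraction of (premise, hypothesis, label) triples guessed correctly."""
    correct = sum(hypothesis_only_predict(h) == label for _, h, label in examples)
    return correct / len(examples)


# Hypothetical items in the style of annotator-written hypotheses:
# negation in the hypothesis happens to line up with the label.
TOY_ELICITED = [
    ("A man is playing a guitar on stage.", "A man is performing music.", "entailment"),
    ("A man is playing a guitar on stage.", "Nobody is playing an instrument.", "contradiction"),
    ("Two children are running in a park.", "Children are outdoors.", "entailment"),
    ("Two children are running in a park.", "The children are not moving.", "contradiction"),
]

# Hypothetical items in the style of a recast dataset: the hypothesis is the
# complement of the clause-embedding verb, so one hypothesis can carry
# different labels depending on the premise.
TOY_RECAST = [
    ("She knows that the store is open.", "The store is open.", "entailment"),
    ("I don't think that the store is open.", "The store is open.", "contradiction"),
    ("Do you think that the store is open?", "The store is open.", "neutral"),
]

if __name__ == "__main__":
    print("elicited-style accuracy:", accuracy(TOY_ELICITED))
    print("recast-style accuracy:  ", accuracy(TOY_RECAST))
```

Because the three recast-style items share one hypothesis, any classifier that ignores the premise can be right on at most one of them, which is the intuition behind using embedded complements as hypotheses.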
Pages: 6086-6091
Page count: 6
Related papers
50 in total
  • [1] ExBERT: An External Knowledge Enhanced BERT for Natural Language Inference
    Gajbhiye, Amit
    Al Moubayed, Noura
    Bradley, Steven
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2021, PT V, 2021, 12895 : 460 - 472
  • [2] Natural Language Inference for Portuguese Using BERT and Multilingual Information
    Sobrevilla Cabezudo, Marco Antonio
    Inacio, Marcio
    Rodrigues, Ana Carolina
    Casanova, Edresson
    de Sousa, Rogerio Figueredo
    COMPUTATIONAL PROCESSING OF THE PORTUGUESE LANGUAGE, PROPOR 2020, 2020, 12037 : 346 - 356
  • [3] Performance Evaluation of BERT Vectors on Natural Language Inference Models
    Ogul, Iskender Ulgen
    Tekir, Selma
    29TH IEEE CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS (SIU 2021), 2021,
  • [4] How Fast can BERT Learn Simple Natural Language Inference?
    Lin, Yi-Chung
    Su, Keh-Yih
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 626 - 633
  • [5] Evaluating Deep Learning Techniques for Natural Language Inference
    Eleftheriadis, Petros
    Perikos, Isidoros
    Hatzilygeroudis, Ioannis
    APPLIED SCIENCES-BASEL, 2023, 13 (04):
  • [6] XINFOTABS: Evaluating Multilingual Tabular Natural Language Inference
    Minhas, Bhavnick
    Shankhdhar, Anant
    Gupta, Vivek
    Aggrawal, Divyanshu
    Zhang, Shuo
    PROCEEDINGS OF THE FIFTH FACT EXTRACTION AND VERIFICATION WORKSHOP (FEVER 2022), 2022, : 59 - 77
  • [7] Evaluating Natural Language Inference Models: A Metamorphic Testing Approach
    Jiang, Mingyue
    Bao, Houzhen
    Tu, Kaiyi
    Zhang, Xiao-Yi
    Ding, Zuohua
    2021 IEEE 32ND INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING (ISSRE 2021), 2021, : 220 - 230
  • [8] LogicAttack: Adversarial Attacks for Evaluating Logical Consistency of Natural Language Inference
    Nakamura, Mutsumi
    Mashetty, Santosh
    Parmar, Mihir
    Varshney, Neeraj
    Baral, Chitta
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 13322 - 13334
  • [9] SherLIiC: A Typed Event-Focused Lexical Inference Benchmark for Evaluating Natural Language Inference
    Schmitt, Martin
    Schuetze, Hinrich
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 902 - 914
  • [10] TinyBERT: Distilling BERT for Natural Language Understanding
    Jiao, Xiaoqi
    Yin, Yichun
    Shang, Lifeng
    Jiang, Xin
    Chen, Xiao
    Li, Linlin
    Wang, Fang
    Liu, Qun
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 4163 - 4174