Evaluating BERT for natural language inference: A case study on the CommitmentBank

Cited by: 0
Authors
Jiang, Nanjiang [1]
de Marneffe, Marie-Catherine [1]
Affiliations
[1] Ohio State Univ, Dept Linguist, Columbus, OH 43210 USA
Funding
National Science Foundation (USA)
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Natural language inference (NLI) datasets (e.g., MultiNLI) were collected by soliciting hypotheses for a given premise from annotators. Such data collection led to annotation artifacts: systems can identify the premise-hypothesis relationship without observing the premise (e.g., negation in the hypothesis being indicative of contradiction). We address this problem by recasting the CommitmentBank for NLI, which contains items involving reasoning over the extent to which a speaker is committed to complements of clause-embedding verbs under entailment-canceling environments (conditional, negation, modal, and question). Instead of being constructed to stand in certain relationships with the premise, hypotheses in the recast CommitmentBank are the complements of the clause-embedding verb in each premise, leading to no annotation artifacts in the hypothesis. A state-of-the-art BERT-based model performs well on the CommitmentBank with 85% F1. However, analysis of model behavior shows that the BERT models still do not capture the full complexity of pragmatic reasoning, nor encode some of the linguistic generalizations, highlighting room for improvement.
Pages: 6086-6091
Page count: 6