Probing for Bridging Inference in Transformer Language Models

Cited by: 0
Authors
Pandit, Onkar [1 ]
Hou, Yufang [2 ]
Affiliations
[1] Univ Lille, CNRS, Cent Lille, INRIA Lille,UMR 9189,CRIStAL, F-59000 Lille, France
[2] IBM Res Europe, Dublin, Ireland
Keywords: (none listed)
DOI: not available
Chinese Library Classification: TP18 (Artificial Intelligence Theory)
Discipline Classification Codes: 081104; 0812; 0835; 1405
Abstract
We probe pre-trained transformer language models for bridging inference. We first investigate individual attention heads in BERT and observe that attention heads at higher layers focus on bridging relations more prominently than those at the lower and middle layers; moreover, a few specific attention heads concentrate consistently on bridging. More importantly, our second approach considers the language model as a whole: bridging anaphora resolution is formulated as a masked token prediction task (Of-Cloze test). This formulation yields promising results without any fine-tuning, which indicates that pre-trained language models substantially capture bridging inference. Further investigation shows that the anaphor-antecedent distance and the context provided to the language model play an important role in the inference.
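For concreteness, the sketch below (not from the paper) illustrates the general idea of such an Of-Cloze probe: the anaphor is placed in an "<anaphor> of [MASK]" pattern after its context, and candidate antecedent heads are ranked by the probability the masked language model assigns to the masked slot. It assumes the Hugging Face transformers library and bert-base-cased; the helper name rank_antecedents, the single-wordpiece candidate handling, and the toy example are illustrative choices, not the authors' implementation.

import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
model = BertForMaskedLM.from_pretrained("bert-base-cased")
model.eval()

def rank_antecedents(context, anaphor, candidates):
    # Build the cloze "<context> <anaphor> of [MASK] ." and rank candidate
    # antecedent heads by the masked-token probability BERT assigns to them.
    cloze = f"{context} {anaphor} of {tokenizer.mask_token} ."
    inputs = tokenizer(cloze, return_tensors="pt")
    mask_index = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0].item()
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_index]
    probs = torch.softmax(logits, dim=-1)
    scores = {}
    for cand in candidates:
        ids = tokenizer.encode(cand, add_special_tokens=False)
        if len(ids) == 1:  # keep single-wordpiece candidates only, for simplicity
            scores[cand] = probs[ids[0]].item()
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical toy example: the anaphor "the door" should bridge to "house".
print(rank_antecedents("We looked around the house .", "the door", ["house", "garden", "car"]))

In this setup no parameters are updated; the ranking relies entirely on what the pre-trained model already encodes, which is the point of the probing formulation described in the abstract.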
Pages: 4153-4163
Page count: 11
Related Papers
50 items in total
  • [41] TaskLAMA: Probing the Complex Task Understanding of Language Models
    Yuan, Quan
    Kazemi, Mehran
    Xu, Xin
    Noble, Isaac
    Imbrasaite, Vaiva
    Ramachandran, Deepak
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 17, 2024, : 19468 - 19476
  • [42] Probing for Hyperbole in Pre-Trained Language Models
    Schneidermann, Nina Skovgaard
    Hershcovich, Daniel
    Pedersen, Bolette Sandford
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-SRW 2023, VOL 4, 2023, : 200 - 211
  • [43] Probing for Predicate Argument Structures in Pretrained Language Models
    Conia, Simone
    Navigli, Roberto
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1 (LONG PAPERS), 2022, : 4622 - 4632
  • [44] Probing Pretrained Language Models for Semantic Attributes and their Values
    Beloucif, Meriem
    Biemann, Chris
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 2554 - 2559
  • [45] Propositional Reasoning via Neural Transformer Language Models
    Tomasic, Anthony
    Romero, Oscar J.
    Zimmerman, John
    Steinfeld, Aaron
    NEURAL-SYMBOLIC LEARNING AND REASONING, NESY 2022, 2022, : 104 - 119
  • [46] Can Transformer Language Models Predict Psychometric Properties?
    Laverghetta, Antonio, Jr.
    Nighojkar, Animesh
    Mirzakhalov, Jamshidbek
    Licato, John
    10TH CONFERENCE ON LEXICAL AND COMPUTATIONAL SEMANTICS (SEM 2021), 2021, : 12 - 25
  • [47] Improved Hybrid Streaming ASR with Transformer Language Models
    Baquero-Arnal, Pau
    Jorge, Javier
    Gimenez, Adria
    Albert Silvestre-Cerda, Joan
    Iranzo-Sanchez, Javier
    Sanchis, Albert
    Civera, Jorge
    Juan, Alfons
    INTERSPEECH 2020, 2020, : 2127 - 2131
  • [48] Comparing Symbolic Models of Language via Bayesian Inference
    Heuser, Annika
    Tsvilodub, Polina
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 15799 - 15800
  • [49] Sources of Hallucination by Large Language Models on Inference Tasks
    McKenna, Nick
    Li, Tianyi
    Cheng, Liang
    Hosseini, Mohammad Javad
    Johnson, Mark
    Steedman, Mark
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023, : 2758 - 2774
  • [50] Do Language Models Perform Generalizable Commonsense Inference?
    Wang, Peifeng
    Ilievski, Filip
    Chen, Muhao
    Ren, Xiang
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 3681 - 3688