Interpretable Entity Representations through Large-Scale Typing

Cited by: 0

Authors:

Onoe, Yasumasa [1]

Durrett, Greg [1]

Affiliations:

[1] Univ Texas Austin, Dept Comp Sci, Austin, TX 78712 USA

Source:

FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020 | 2020

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In standard methodology for natural language processing, entities in text are typically embedded in dense vector spaces with pre-trained models. The embeddings produced this way are effective when fed into downstream models, but they require end-task fine-tuning and are fundamentally difficult to interpret. In this paper, we present an approach to creating entity representations that are human readable and achieve high performance on entity-related tasks out of the box. Our representations are vectors whose values correspond to posterior probabilities over fine-grained entity types, indicating the confidence of a typing model's decision that the entity belongs to the corresponding type. We obtain these representations using a fine-grained entity typing model, trained either on supervised ultra-fine entity typing data (Choi et al., 2018) or distantly-supervised examples from Wikipedia. On entity probing tasks involving recognizing entity identity, our embeddings used in parameter-free downstream models achieve competitive performance with ELMo- and BERT-based embeddings in trained models. We also show that it is possible to reduce the size of our type set in a learning-based way for particular domains. Finally, we show that these embeddings can be post-hoc modified through a small number of rules to incorporate domain knowledge and improve performance.
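
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the five-type vocabulary, the probability values, and the helper names (`cosine`, `top_types`, `apply_rule`) are all hypothetical, and it assumes each vector entry is an independent probability in [0, 1]. It shows how a type-probability vector can be read directly, compared with a parameter-free similarity, and adjusted by a simple post-hoc rule.

```python
import math

# Hypothetical type vocabulary; the type sets described in the abstract are far larger.
TYPES = ["person", "politician", "athlete", "organization", "location"]

def cosine(u, v):
    """Parameter-free similarity between two type-probability vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def top_types(vec, k=2):
    """Read the representation: return the k most probable type names."""
    ranked = sorted(zip(TYPES, vec), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:k]]

def apply_rule(vec, if_type, then_type, threshold=0.5):
    """Illustrative post-hoc rule: if `if_type` is confident, raise `then_type` to match it."""
    out = list(vec)
    i, j = TYPES.index(if_type), TYPES.index(then_type)
    if out[i] >= threshold:
        out[j] = max(out[j], out[i])
    return out

# Made-up probabilities standing in for a typing model's outputs (assumption).
mention_a = [0.95, 0.90, 0.05, 0.02, 0.01]  # a politician in one context
mention_b = [0.97, 0.85, 0.10, 0.03, 0.02]  # the same kind of entity elsewhere
mention_c = [0.03, 0.01, 0.02, 0.10, 0.96]  # a location

print(top_types(mention_a))
print(cosine(mention_a, mention_b) > cosine(mention_a, mention_c))
print(apply_rule([0.10, 0.90, 0.0, 0.0, 0.0], "politician", "person"))
```

Because every dimension has a name, both the similarity score and the effect of the rule can be traced back to specific types, which is the sense in which such vectors are human readable.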


Pages: 612-624

Page count: 13

