Interpretable Entity Representations through Large-Scale Typing

Cited by: 0

Authors:

Onoe, Yasumasa [1]

Durrett, Greg [1]

Affiliations:

[1] Univ Texas Austin, Dept Comp Sci, Austin, TX 78712 USA

Source:

FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020 | 2020

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In standard methodology for natural language processing, entities in text are typically embedded in dense vector spaces with pre-trained models. The embeddings produced this way are effective when fed into downstream models, but they require end-task fine-tuning and are fundamentally difficult to interpret. In this paper, we present an approach to creating entity representations that are human readable and achieve high performance on entity-related tasks out of the box. Our representations are vectors whose values correspond to posterior probabilities over fine-grained entity types, indicating the confidence of a typing model's decision that the entity belongs to the corresponding type. We obtain these representations using a fine-grained entity typing model, trained either on supervised ultra-fine entity typing data (Choi et al., 2018) or distantly-supervised examples from Wikipedia. On entity probing tasks involving recognizing entity identity, our embeddings used in parameter-free downstream models achieve competitive performance with ELMo- and BERT-based embeddings in trained models. We also show that it is possible to reduce the size of our type set in a learning-based way for particular domains. Finally, we show that these embeddings can be post-hoc modified through a small number of rules to incorporate domain knowledge and improve performance.
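
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the five-type inventory `TYPES`, the per-type scores, the use of an independent sigmoid per type, the choice of cosine similarity as the parameter-free comparison, and the helper names (`type_probabilities`, `cosine`, `apply_rule`, `top_types`) are all assumptions made for illustration. A real typing model would produce the scores from a mention in context, over a much larger type set.

```python
import math

# Hypothetical toy type inventory (assumption); real type sets are far larger.
TYPES = ["person", "politician", "athlete", "organization", "location"]


def sigmoid(x):
    """Squash a real-valued score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def type_probabilities(scores):
    """Turn one score per type into one probability per type.

    Assumption: each type is scored independently, so the result is a
    vector of per-type confidences rather than a single distribution.
    """
    return [sigmoid(s) for s in scores]


def cosine(u, v):
    """Parameter-free similarity between two type-probability vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def apply_rule(vec, type_name, value):
    """Post-hoc edit: return a copy with one named type's probability overwritten."""
    out = list(vec)
    out[TYPES.index(type_name)] = value
    return out


def top_types(vec, k=2):
    """Read a vector back as its k most confident type names."""
    ranked = sorted(zip(TYPES, vec), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Made-up scores standing in for a typing model's output on three mentions.
mention_a = type_probabilities([4.0, 3.0, -4.0, -3.0, -5.0])
mention_b = type_probabilities([3.5, 2.5, -3.0, -4.0, -4.0])
mention_c = type_probabilities([-4.0, -5.0, -4.0, 3.0, 2.0])

print(top_types(mention_a))
print(cosine(mention_a, mention_b) > cosine(mention_a, mention_c))
```

Because every dimension is a named type, a vector can be read directly (`top_types`) and corrected by hand (`apply_rule`), which is the sense in which such representations are interpretable and editable without retraining.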

Pages: 612-624

Page count: 13
