Interpretable Entity Representations through Large-Scale Typing

Cited by: 0
Authors
Onoe, Yasumasa [1]
Durrett, Greg [1]
Affiliations
[1] Univ Texas Austin, Dept Comp Sci, Austin, TX 78712 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In standard methodology for natural language processing, entities in text are typically embedded in dense vector spaces with pre-trained models. The embeddings produced this way are effective when fed into downstream models, but they require end-task fine-tuning and are fundamentally difficult to interpret. In this paper, we present an approach to creating entity representations that are human-readable and achieve high performance on entity-related tasks out of the box. Our representations are vectors whose values correspond to posterior probabilities over fine-grained entity types, indicating the confidence of a typing model's decision that the entity belongs to the corresponding type. We obtain these representations using a fine-grained entity typing model, trained either on supervised ultra-fine entity typing data (Choi et al., 2018) or distantly-supervised examples from Wikipedia. On entity probing tasks involving recognizing entity identity, our embeddings used in parameter-free downstream models achieve competitive performance with ELMo- and BERT-based embeddings in trained models. We also show that it is possible to reduce the size of our type set in a learning-based way for particular domains. Finally, we show that these embeddings can be post-hoc modified through a small number of rules to incorporate domain knowledge and improve performance.
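The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes that a typing model emits one raw score per type and that each score is mapped to a probability independently. The type inventory `TYPES`, the helper names, and all scores below are hypothetical and made up for illustration. Cosine similarity stands in for a "parameter-free downstream model"; the abstract does not specify which comparison is used.

```python
import math

# Hypothetical, tiny type inventory; a real fine-grained type set is far larger.
TYPES = ["person", "politician", "athlete", "organization", "location"]


def type_vector(scores):
    """Map one raw score per type to an independent probability in (0, 1).

    Assumption: a per-type sigmoid, so the entries need not sum to 1.
    """
    return [1.0 / (1.0 + math.exp(-s)) for s in scores]


def cosine(u, v):
    """Cosine similarity: a comparison with no trainable parameters."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_types(vec, k=2):
    """Read the representation directly: the k most probable type names."""
    ranked = sorted(zip(TYPES, vec), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Made-up scores standing in for a typing model's output on three mentions.
mention_a = type_vector([4.0, 3.0, -4.0, -5.0, -5.0])  # a politician in one context
mention_b = type_vector([3.5, 2.5, -3.0, -4.0, -5.0])  # same kind of entity, other context
mention_c = type_vector([-4.0, -5.0, -5.0, 1.0, 4.0])  # a place

print(top_types(mention_a))
print(cosine(mention_a, mention_b) > cosine(mention_a, mention_c))
```

Because every dimension is tied to a named type, the vector can be inspected by listing its highest-probability types, and two mentions can be compared without training any additional parameters.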
Pages: 612-624
Page count: 13