Embedded Word Representations for Rich Indexing: A Case Study for Medical Records

Cited by: 2
Authors
Metcalf, Katherine [1]
Leake, David [1]
Affiliations
[1] Indiana Univ, Sch Informat Comp & Engn, Bloomington, IN 47408 USA
Source
CASE-BASED REASONING RESEARCH AND DEVELOPMENT, ICCBR 2018 | 2018 / Vol. 11156
Keywords
Case-based reasoning for medicine; Electronic health records; Indexing; Textual case-based reasoning; Vector space embedding; TEXT; RETRIEVAL; UMLS;
DOI
10.1007/978-3-030-01081-2_18
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Case indexing decisions must often confront the tradeoff between rich semantic indexing schemes, which provide effective retrieval at large indexing cost, and shallower indexing schemes, which enable low-cost indexing but may be less reliable. Indexing for textual case-based reasoning is often based on information retrieval approaches that minimize index acquisition cost but sacrifice semantic information. This paper presents JointEmbed, a method for automatically generating rich indices. JointEmbed automatically generates continuous vector space embeddings that implicitly capture semantic information, leveraging multiple knowledge sources such as free text cases and pre-existing knowledge graphs. JointEmbed generates effective indices by applying pTransR, a novel approach for modelling knowledge graphs, to encode and summarize contents of domain knowledge resources. JointEmbed is applied to the medical CBR task of retrieving relevant patient electronic health records, for which potential health consequences make retrieval quality paramount. An evaluation supports that JointEmbed outperforms previous methods.
Pages: 264-280
Page count: 17