Learning a Health Knowledge Graph from Electronic Medical Records

Cited by: 258
Authors
Rotmensch, Maya [1]
Halpern, Yoni [2]
Tlimat, Abdulhakim [3]
Horng, Steven [3, 4]
Sontag, David [5, 6]
Affiliations
[1] NYU, Ctr Data Sci, New York, NY USA
[2] NYU, Dept Comp Sci, New York, NY USA
[3] Beth Israel Deaconess Med Ctr, Dept Emergency Med, Boston, MA 02215 USA
[4] Beth Israel Deaconess Med Ctr, Div Clin Informat, Boston, MA 02215 USA
[5] MIT, Dept Elect Engn & Comp Sci, Comp Sci & Artificial Intelligence Lab, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[6] MIT, Inst Med Engn & Sci, 77 Massachusetts Ave, Cambridge, MA 02139 USA
Source
SCIENTIFIC REPORTS | 2017 / Vol. 7
Keywords
DIAGNOSIS; PROGRAM;
DOI
10.1038/s41598-017-05778-z
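Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification codes
07; 0710; 09
Abstract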
Demand for clinical decision support systems in medicine and self-diagnostic symptom checkers has substantially increased in recent years. Existing platforms rely on knowledge bases manually compiled through a labor-intensive process or automatically derived using simple pairwise statistics. This study explored an automated process to learn high quality knowledge bases linking diseases and symptoms directly from electronic medical records. Medical concepts were extracted from 273,174 de-identified patient records and maximum likelihood estimation of three probabilistic models was used to automatically construct knowledge graphs: logistic regression, naive Bayes classifier and a Bayesian network using noisy OR gates. A graph of disease-symptom relationships was elicited from the learned parameters and the constructed knowledge graphs were evaluated and validated, with permission, against Google's manually-constructed knowledge graph and against expert physician opinions. Our study shows that direct and automated construction of high quality health knowledge graphs from medical records using rudimentary concept extraction is feasible. The noisy OR model produces a high quality knowledge graph reaching precision of 0.85 for a recall of 0.6 in the clinical evaluation. Noisy OR significantly outperforms all tested models across evaluation frameworks (p < 0.01).
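As a rough illustration of the noisy OR gate named in the abstract, the sketch below computes the probability that a symptom is present given which diseases are present. It is a minimal sketch of the standard noisy OR formulation, not the authors' implementation: the function name, the leak term, and all numeric values are hypothetical and are not parameters reported in the paper.

```python
from math import prod


def noisy_or_probability(activation_probs, present, leak=0.0):
    """Return P(symptom = 1 | diseases) under a noisy OR gate.

    activation_probs[i] is the (hypothetical) probability that disease i
    alone turns the symptom on; present[i] says whether disease i is present.
    leak is the assumed probability that the symptom appears with no
    modeled cause.
    """
    # Each present disease independently fails to trigger the symptom
    # with probability 1 - p; the symptom is absent only if all fail.
    all_fail = prod(1.0 - p for p, on in zip(activation_probs, present) if on)
    return 1.0 - (1.0 - leak) * all_fail


# Illustrative values only: two present diseases and a small leak.
p_both = noisy_or_probability([0.8, 0.5], [True, True], leak=0.1)
# With no disease present, only the leak can explain the symptom.
p_none = noisy_or_probability([0.8, 0.5], [False, False], leak=0.1)
```

In a model of this kind, an edge between a disease and a symptom could be scored from the size of the learned activation probability, which is one way a graph might be read off the parameters.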
Pages: 11