An open natural language processing (NLP) framework for EHR-based clinical research: a case demonstration using the National COVID Cohort Collaborative (N3C)

Cited by: 4
Authors
Liu, Sijia [1]
Wen, Andrew [1]
Wang, Liwei [1]
He, Huan [1]
Fu, Sunyang [1]
Miller, Robert [2]
Williams, Andrew [2]
Harris, Daniel [3]
Kavuluru, Ramakanth [3]
Liu, Mei [4]
Abu-el-Rub, Noor [4]
Schutte, Dalton [5]
Zhang, Rui [5]
Rouhizadeh, Masoud [6]
Osborne, John D. [7]
He, Yongqun [8]
Topaloglu, Umit [9]
Hong, Stephanie S. [10]
Saltz, Joel H. [11]
Schaffter, Thomas [12]
Pfaff, Emily [13]
Chute, Christopher G. [10]
Duong, Tim [14]
Haendel, Melissa A. [15]
Fuentes, Rafael [16]
Szolovits, Peter [17]
Xu, Hua [18]
Liu, Hongfang [1, 18]
Affiliations
[1] Mayo Clin, Dept Artificial Intelligence & Informat, Rochester, MN USA
[2] Tufts Med Ctr, Tufts Clin & Translat Sci Inst, Boston, MA USA
[3] Univ Kentucky, Dept Internal Med, Lexington, KY USA
[4] Univ Kansas, Dept Internal Med, Med Ctr, Kansas City, KS USA
[5] Univ Minnesota Twin Cities, Dept Pharmaceut Care Hlth Syst, Minneapolis, MN USA
[6] Univ Florida, Dept Pharmaceut Outcomes & Policy, Gainesville, FL USA
[7] Univ Alabama Birmingham, Dept Comp Sci, Birmingham, AL USA
[8] Univ Michigan, Dept Comp Med & Bioinformat, Med Sch, Ann Arbor, MI USA
[9] Wake Forest Sch Med, Dept Canc Biol, Winston Salem, NC USA
[10] Johns Hopkins Univ, Dept Med, Baltimore, MD USA
[11] SUNY Stony Brook, Dept Biomed Informat, Stony Brook, NY USA
[12] Sage Bionetworks, Seattle, WA USA
[13] Univ North Carolina Chapel Hill, Dept Med, Chapel Hill, NC USA
[14] Albert Einstein Coll Med, Dept Radiol, Bronx, NY USA
[15] Univ Colorado, Ctr Hlth AI, Anschutz Med Campus, Denver, CO USA
[16] Alex Informat, North Bethesda, MD USA
[17] MIT, Dept Elect Engn & Comp Sci, Cambridge, MA USA
[18] Univ Texas Hlth Sci Ctr Houston, Sch Biomed Informat, Houston, TX USA
Funding
National Institutes of Health (NIH), USA
Keywords
electronic health records; natural language processing; federated learning; multi-institutional data annotation
DOI
10.1093/jamia/ocad134
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Despite recent methodological advances in clinical natural language processing (NLP), the adoption of clinical NLP models within the translational research community remains hindered by process heterogeneity and human factor variations. These factors also dramatically increase the difficulty of developing NLP models in multi-site settings, which is necessary for algorithm robustness and generalizability. Here, we report on our experience developing an NLP solution for Coronavirus Disease 2019 (COVID-19) sign and symptom extraction in an open NLP framework from a subset of sites participating in the National COVID Cohort Collaborative (N3C). We then empirically highlight the benefits of multi-site data for both symbolic and statistical methods, and the need for federated annotation and evaluation to resolve several pitfalls encountered in the course of these efforts.
Pages: 2036-2040
Page count: 5