Extracting social determinants of health from electronic health records using natural language processing: a systematic review

Cited by: 101
Authors
Patra, Braja G. [1]
Sharma, Mohit M. [1]
Vekaria, Veer [1]
Adekkanattu, Prakash [2]
Patterson, Olga V. [3, 4]
Glicksberg, Benjamin [5]
Lepow, Lauren A. [5]
Ryu, Euijung [6]
Biernacka, Joanna M. [6]
Furmanchuk, Al'ona [7]
George, Thomas J. [8]
Hogan, William [9]
Wu, Yonghui [8]
Yang, Xi [8]
Bian, Jiang [8]
Weissman, Myrna [10]
Wickramaratne, Priya [10]
Mann, J. John [10]
Olfson, Mark [10]
Campion, Thomas R., Jr. [1, 2]
Weiner, Mark [1]
Pathak, Jyotishman [1]
Affiliations
[1] Weill Cornell Med, Dept Populat Hlth Sci, 425 E 61st St,Suite 301, New York, NY 10065 USA
[2] Weill Cornell Med, Informat Technol & Serv, New York, NY 10065 USA
[3] Univ Utah, Dept Internal Med, Div Epidemiol, Salt Lake City, UT 84112 USA
[4] US Dept Vet Affairs, Salt Lake City, UT USA
[5] Icahn Sch Med Mt Sinai, New York, NY 10029 USA
[6] Mayo Clin, Dept Quantitat Hlth Sci, Rochester, MN USA
[7] Northwestern Univ, Chicago, IL 60611 USA
[8] Univ Florida, Dept Hlth Outcomes & Biomed Informat, Gainesville, FL USA
[9] Univ Florida, Coll Med, Dept Med, Div Hematol & Oncol, Gainesville, FL USA
[10] Columbia Univ, Vagelos Coll Phys & Surg, New York, NY USA
Keywords
social determinants of health; population health outcomes; electronic health records; natural language processing; information extraction; machine learning; PROBLEM OPIOID USE; BINGE-EATING DISORDER; AUTOMATED IDENTIFICATION; UNSTRUCTURED DATA; CARE; VALIDATION; ABUSE; RISK
DOI
10.1093/jamia/ocab170
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Objective: Social determinants of health (SDoH) are nonclinical dispositions that impact patient health risks and clinical outcomes. Leveraging SDoH in clinical decision-making can potentially improve diagnosis, treatment planning, and patient outcomes. Despite increased interest in capturing SDoH in electronic health records (EHRs), such information is typically locked in unstructured clinical notes. Natural language processing (NLP) is the key technology to extract SDoH information from clinical text and expand its utility in patient care and research. This article presents a systematic review of the state-of-the-art NLP approaches and tools that focus on identifying and extracting SDoH data from unstructured clinical text in EHRs.

Materials and Methods: A broad literature search was conducted in February 2021 using 3 scholarly databases (ACL Anthology, PubMed, and Scopus) following Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. A total of 6402 publications were initially identified, and after applying the study inclusion criteria, 82 publications were selected for the final review.

Results: Smoking status (n=27), substance use (n=21), homelessness (n=20), and alcohol use (n=15) are the most frequently studied SDoH categories. Homelessness (n=7) and other less-studied SDoH (eg, education, financial problems, social isolation and support, family problems) are mostly identified using rule-based approaches. In contrast, machine learning approaches are popular for identifying smoking status (n=13), substance use (n=9), and alcohol use (n=9).

Conclusion: NLP offers significant potential to extract SDoH data from narrative clinical notes, which in turn can aid in the development of screening tools, risk prediction models, and clinical decision support systems.
Pages: 2716-2727
Page count: 12