Machine learning enabled subgroup analysis with real-world data to inform clinical trial eligibility criteria design

Cited by: 3
Authors
Xu, Jie [1, 2]
Zhang, Hao [2]
Zhang, Hansi [1]
Bian, Jiang [1]
Wang, Fei [2]
Affiliations
[1] Univ Florida, Dept Hlth Outcomes & Biomed Informat, Gainesville, FL 32610 USA
[2] Weill Cornell Med, Dept Populat Hlth Sci, New York, NY 10065 USA
Source
SCIENTIFIC REPORTS | 2023 / Volume 13 / Issue 01
Keywords
RANDOMIZED CONTROLLED-TRIALS; EXTERNAL VALIDITY; GENERALIZABILITY; RECRUITMENT; VARIABLES; DEMENTIA; ERA;
DOI
10.1038/s41598-023-27856-1
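Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code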
07 ; 0710 ; 09 ;
Abstract
Overly restrictive eligibility criteria for clinical trials may limit the generalizability of the trial results to their target real-world patient populations. We developed a novel machine learning approach using large collections of real-world data (RWD) to better inform clinical trial eligibility criteria design. We extracted patients' clinical events from electronic health records (EHRs), including demographics, diagnoses, and drugs, and assumed that certain compositions of these clinical events within an individual's EHRs can determine the subphenotypes, that is, homogeneous clusters of patients in which patients share similar clinical characteristics. We introduced an outcome-guided probabilistic model to identify those subphenotypes, such that the patients within the same subgroup not only share similar clinical characteristics but are also at similar risk levels of encountering severe adverse events (SAEs). We evaluated our algorithm on two previously conducted clinical trials with EHRs from the OneFlorida+ Clinical Research Consortium. Our model can clearly identify, as subphenotypes, the patient subgroups that are more or less likely to suffer from SAEs, in a transparent and interpretable way. Our approach identified a set of clinical topics and derived novel patient representations based on them. Each clinical topic represents a certain clinical event composition pattern learned from the patient EHRs. In both trials, the patient subgroup (#SAE=0) and the patient subgroup (#SAE>0) were well separated by k-means clustering using the inferred topics. The inferred topics characterized as likely to align with the patient subgroup (#SAE>0) revealed meaningful combinations of clinical features and can provide data-driven recommendations for refining the exclusion criteria of clinical trials. The proposed supervised topic modeling approach can infer the clinical topics from the subphenotypes with or without SAEs. Potential rules describing the patient subgroups with SAEs can be further derived to inform the design of clinical trial eligibility criteria.
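The clustering step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it does not fit the outcome-guided topic model, and it assumes the per-patient topic proportions are already available. The `patient_topics` and `has_sae` values are invented toy data, and the helper names (`kmeans`, `purity`, `mean_vector`) are hypothetical. The sketch runs plain k-means on topic-proportion vectors and checks how well the two clusters line up with the #SAE=0 and #SAE>0 groups.

```python
from math import dist

# Hypothetical toy data (not from the paper): each row is one patient's
# proportions over 3 inferred topics; has_sae marks #SAE>0 (1) vs #SAE=0 (0).
patient_topics = [
    [0.80, 0.15, 0.05],
    [0.75, 0.20, 0.05],
    [0.70, 0.20, 0.10],
    [0.10, 0.15, 0.75],
    [0.05, 0.25, 0.70],
    [0.10, 0.10, 0.80],
]
has_sae = [0, 0, 0, 1, 1, 1]


def mean_vector(rows):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def kmeans(points, k=2, max_iter=100):
    """Plain Lloyd's k-means with deterministic farthest-point seeding."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no patient changed cluster
        assignment = new_assignment
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = mean_vector(members)
    return assignment, centroids


def purity(assignment, labels):
    """Fraction of patients whose label matches their cluster's majority label."""
    correct = 0
    for cluster in set(assignment):
        cluster_labels = [l for a, l in zip(assignment, labels) if a == cluster]
        correct += max(cluster_labels.count(v) for v in set(cluster_labels))
    return correct / len(labels)


clusters, centers = kmeans(patient_topics, k=2)
print("cluster assignment:", clusters)
print("purity vs. SAE labels:", purity(clusters, has_sae))
```

On this toy input the two clusters coincide with the SAE labels, which is only a property of the invented data. With real topic proportions, the centroid of the cluster dominated by #SAE>0 patients would indicate which topics deserve a closer look when drafting exclusion criteria.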
Pages: 13