Dynamic categorization of clinical research eligibility criteria by hierarchical clustering

Cited by: 34
Authors
Luo, Zhihui [1]
Yetisgen-Yildiz, Meliha [2]
Weng, Chunhua [1]
Affiliations
[1] Columbia Univ, Dept Biomed Informat, New York, NY 10032 USA
[2] Univ Washington, Seattle, WA 98195 USA
Keywords
Clinical research eligibility criteria; Classification; Hierarchical clustering; Knowledge representation; Unified Medical Language System (UMLS); Machine learning; Feature representation
DOI
10.1016/j.jbi.2011.06.001
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Objective: To semi-automatically induce semantic categories of eligibility criteria from text and to automatically classify eligibility criteria based on their semantic similarity. Design: The UMLS semantic types and a set of previously developed semantic preference rules were utilized to create an unambiguous semantic feature representation to induce eligibility criteria categories through hierarchical clustering and to train supervised classifiers. Measurements: We induced 27 categories and measured the prevalence of the categories in 27,278 eligibility criteria from 1578 clinical trials and compared the classification performance (i.e., precision, recall, and F1-score) between the UMLS-based feature representation and the "bag of words" feature representation among five common classifiers in Weka, including J48, Bayesian Network, Naive Bayesian, Nearest Neighbor, and instance-based learning classifier. Results: The UMLS semantic feature representation outperforms the "bag of words" feature representation in 89% of the criteria categories. Using the semantically induced categories, machine-learning classifiers required only 2000 instances to stabilize classification performance. The J48 classifier yielded the best F1-score and the Bayesian Network classifier achieved the best learning efficiency. Conclusion: The UMLS is an effective knowledge source and can enable an efficient feature representation for semi-automated semantic category induction and automatic categorization for clinical research eligibility criteria and possibly other clinical text. © 2011 Elsevier Inc. All rights reserved.
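The clustering step described in the abstract can be illustrated with a minimal sketch. It assumes each criterion has already been reduced to a set of UMLS semantic types, and it uses Jaccard distance with average linkage and a fixed cut-off; the abstract does not state the distance measure, linkage, or threshold the authors used, so all three are assumptions, as are the function names and the toy data.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Distance between two sets of semantic types (0 = identical, 1 = disjoint)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def average_linkage(c1, c2, features):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(jaccard_distance(features[i], features[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def cluster(features, threshold):
    """Agglomerative clustering: merge the closest pair until none is within threshold.

    The threshold is a hypothetical cut-off chosen for this illustration.
    """
    clusters = [[i] for i in range(len(features))]
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest average-linkage distance.
        (x, y), dist = min(
            (((x, y), average_linkage(clusters[x], clusters[y], features))
             for x, y in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if dist > threshold:
            break
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
    return sorted(sorted(c) for c in clusters)


# Toy data (not from the paper): each criterion as a set of UMLS semantic types.
criteria = [
    {"Disease or Syndrome"},
    {"Disease or Syndrome", "Finding"},
    {"Pharmacologic Substance"},
    {"Pharmacologic Substance", "Therapeutic or Preventive Procedure"},
    {"Age Group"},
]

clusters = cluster(criteria, threshold=0.6)
print(clusters)
```

With this toy input, the two disease-related criteria and the two drug-related criteria each form a group, and the age criterion stays on its own. The resulting groups would then serve as candidate categories for manual review, which is the "semi-automatic" part of the induction.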
Pages: 927-935
Page count: 9