The GAAIN Entity Mapper: An Active-Learning System for Medical Data Mapping

Authors
Ashish, Naveen [1]
Dewan, Peehoo [1]
Toga, Arthur W. [1]
Affiliations
[1] University of Southern California, Keck School of Medicine, Laboratory of Neuro Imaging, Stevens Neuroimaging and Informatics Institute, Los Angeles, CA 90033 USA
Source
Funding
National Institutes of Health (USA)
Keywords
data mapping; machine learning; active learning; data harmonization; common data model; Uniform Data Set
DOI
10.3389/fninf.2015.00030
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
This work focuses on mapping biomedical datasets to a common representation, an integral part of data harmonization for integrated biomedical data access and sharing. We present GEM, an intelligent software assistant for automated data mapping across different datasets or from a dataset to a common data model. GEM automates data mapping by providing precise suggestions for data element mappings. It leverages the detailed metadata about elements found in associated dataset documentation, such as the data dictionaries that typically accompany biomedical datasets. It employs unsupervised text mining techniques to determine similarity between data elements, and machine-learning classifiers to identify element matches. It further provides an active-learning capability that optimizes the process of training the system. Our experimental evaluations show that GEM provides highly accurate data mappings (over 90% accuracy) for real datasets of thousands of data elements each in the Alzheimer's disease research domain. We are currently employing GEM to map Alzheimer's disease datasets from around the globe into a common representation, as part of a global Alzheimer's disease integrated data sharing and analysis network called GAAIN. GEM achieves significantly higher data mapping accuracy for biomedical datasets than other state-of-the-art database schema matching tools with similar functionality. With the active-learning capability, the user effort in training the system for new datasets is minimal.
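The abstract names the ingredients (text similarity over data-dictionary descriptions, match suggestions, and active learning) without giving detail. The following is only a minimal sketch of that general idea, not GEM's actual implementation: it assumes TF-IDF cosine similarity as the text-mining step and a smallest-margin rule as a stand-in for active-learning query selection. The element names, descriptions, and function names are all hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    # Lowercase and split on any non-alphanumeric character.
    return "".join(c.lower() if c.isalnum() else " " for c in text).split()

def tfidf_vectors(docs):
    # Build sparse TF-IDF vectors (dicts) with a smoothed IDF term.
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: c * (math.log((1 + n) / (1 + df[t])) + 1.0)
                        for t, c in tf.items()})
    return vectors

def cosine(u, v):
    # Cosine similarity between two sparse vectors.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def suggest_mappings(source, target, top_k=2):
    # Rank target elements for each source element by description similarity.
    s_names, t_names = list(source), list(target)
    vecs = tfidf_vectors([source[n] for n in s_names] +
                         [target[n] for n in t_names])
    s_vecs, t_vecs = vecs[:len(s_names)], vecs[len(s_names):]
    suggestions = {}
    for name, u in zip(s_names, s_vecs):
        scored = sorted(((cosine(u, v), t) for t, v in zip(t_names, t_vecs)),
                        reverse=True)
        suggestions[name] = [(t, round(s, 3)) for s, t in scored[:top_k]]
    return suggestions

def most_uncertain(suggestions):
    # Assumed query rule: ask the user about the source element whose
    # top two candidates are closest in score (smallest margin).
    def margin(name):
        cands = suggestions[name]
        second = cands[1][1] if len(cands) > 1 else 0.0
        return cands[0][1] - second
    return min(suggestions, key=margin)

# Hypothetical data dictionaries: element name -> description.
SOURCE_DICT = {
    "PTGENDER": "Participant gender",
    "AGE_ONSET": "Age at onset of cognitive decline",
    "MMSCORE": "Mini mental state examination total score",
}
TARGET_DICT = {
    "SEX": "Subject's gender",
    "DECAGE": "Age at which cognitive decline began",
    "MMSE": "Total MMSE score mini mental state exam",
}

suggestions = suggest_mappings(SOURCE_DICT, TARGET_DICT)
print(suggestions)
print("Ask the user about:", most_uncertain(suggestions))
```

In a fuller system, the user's answer to each query would be fed back as a labeled example to retrain a match classifier. That loop is omitted here because the abstract does not specify it.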
Pages: 1 / 10
Number of pages: 10