The GAAIN Entity Mapper: An Active-Learning System for Medical Data Mapping

Cited by: 2
Authors
Ashish, Naveen [1]
Dewan, Peehoo [1]
Toga, Arthur W. [1]
Affiliations
[1] Univ So Calif, Keck Sch Med, Lab Neuro Imaging, Stevens Neuroimaging & Informat Inst, Los Angeles, CA 90033 USA
Source
Funding
National Institutes of Health (USA)
Keywords
data mapping; machine learning; active learning; data harmonization; common data model; Uniform Data Set
DOI
10.3389/fninf.2015.00030
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
This work is focused on mapping biomedical datasets to a common representation, as an integral part of data harmonization for integrated biomedical data access and sharing. We present GEM, an intelligent software assistant for automated data mapping across different datasets or from a dataset to a common data model. The GEM system automates data mapping by providing precise suggestions for data element mappings. It leverages the detailed metadata about elements in associated dataset documentation, such as data dictionaries, that are typically available with biomedical datasets. It employs unsupervised text mining techniques to determine similarity between data elements and also employs machine-learning classifiers to identify element matches. It further provides an active-learning capability where the process of training the GEM system is optimized. Our experimental evaluations show that the GEM system provides highly accurate data mappings (over 90% accuracy) for real datasets of thousands of data elements each, in the Alzheimer's disease research domain. Further, the effort in training the system for new datasets is also optimized. We are currently employing the GEM system to map Alzheimer's disease datasets from around the globe into a common representation, as part of a global Alzheimer's disease integrated data sharing and analysis network called GAAIN. GEM achieves significantly higher data mapping accuracy for biomedical datasets compared to other state-of-the-art tools for database schema matching that have similar functionality. With the use of active-learning capabilities, the user effort in training the system is minimal.
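The abstract describes two ideas that a small example may make more concrete: scoring similarity between data elements from their data-dictionary descriptions, and asking the user to label the cases the system is least sure about. The sketch below is illustrative only and is not the GEM implementation. It assumes TF-IDF with cosine similarity as the text-similarity measure and margin-based uncertainty sampling as the active-learning query strategy; the abstract names neither. The element names and descriptions are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split on anything that is not a letter or digit."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return cleaned.split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per document, using smoothed IDF."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append({
            tok: count * (math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0)
            for tok, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(tok, 0.0) for tok, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(source, target):
    """For each source element, list (score, target element) by descending score."""
    src_names, tgt_names = list(source), list(target)
    docs = [source[n] for n in src_names] + [target[n] for n in tgt_names]
    vectors = tfidf_vectors(docs)
    src_vecs = vectors[:len(src_names)]
    tgt_vecs = vectors[len(src_names):]
    rankings = {}
    for name, vec in zip(src_names, src_vecs):
        scored = [(cosine(vec, tv), tn) for tn, tv in zip(tgt_names, tgt_vecs)]
        rankings[name] = sorted(scored, reverse=True)
    return rankings


def most_uncertain(rankings):
    """Pick the source element whose two best candidates are closest in score."""
    def margin(name):
        scores = [score for score, _ in rankings[name]]
        return scores[0] - scores[1] if len(scores) > 1 else scores[0]
    return min(rankings, key=margin)


# Hypothetical data dictionaries: element name -> description.
SOURCE = {
    "PTGENDER": "Sex of the participant",
    "MMSCORE": "Mini mental state examination total score",
    "AGE_BL": "Age of participant at baseline",
}
TARGET = {
    "SEX": "Participant sex",
    "MMSE": "Total score on the mini mental state examination",
    "AGE": "Participant age at baseline visit",
}

if __name__ == "__main__":
    rankings = rank_candidates(SOURCE, TARGET)
    for element, candidates in rankings.items():
        score, best = candidates[0]
        print(f"{element} -> {best} (similarity {score:.2f})")
    print("Ask the user to confirm next:", most_uncertain(rankings))
```

In a fuller system the confirmed or corrected mappings would be fed back as labeled training examples for a match classifier; that loop is omitted here.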
Pages: 1 / 10
Number of pages: 10