MarkerML - Marker Feature Identification in Metagenomic Datasets Using Interpretable Machine Learning

Cited by: 7
Authors
Nagpal, Sunil [1 ,2 ,3 ]
Singh, Rohan [1 ]
Taneja, Bhupesh [2 ,3 ]
Mande, Sharmila S. [1 ]
Affiliations
[1] Tata Consultancy Serv Ltd, TCS Res, Pune 411013, India
[2] CSIR, Inst Genom & Integrat Biol GIB, New Delhi 110025, India
[3] Acad Sci & Innovat Res AcSIR, Ghaziabad 201002, India
Keywords
metagenomic biomarkers; interpretable machine learning; SHAP; microbiome; marker features; DATABASE;
DOI
10.1016/j.jmb.2022.167589
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Identifying environment-specific marker features is a key objective of many metagenomic studies: the aim is to find features in microbiome datasets that can serve as markers of contrasting or comparable states. Hypothesis testing and black-box machine-learned models, the conventional tools for this task, are generally not exhaustive, chiefly because they do not provide any quantifiable relevance (context) of, or between, the identified features. We present the MarkerML web server, which leverages the emergence of interpretable machine learning to facilitate the contextual discovery of metagenomic features of interest. It does so through a comprehensive and automated application of Shapley Additive Explanations (SHAP), used alongside compositionality-aware hypothesis testing for multivariate microbiome datasets. MarkerML not only helps identify marker features but also offers insight into the role and interdependence of those features in driving the decisions of the supervised machine-learned model. High-quality, intuitive visualizations (prediction effect plots, model performance reports, feature dependency plots, Shapley- and abundance-informed cladograms (Sungrams), and hypothesis-tested violin plots), along with provisions for excluding participant bias and ensuring reproducibility of results, are intended to make the platform a useful asset for scientists in the microbiome field and beyond. The MarkerML web server is freely available to the academic community at https://microbiome.igib.res.in/markerml/. (c) 2022 Elsevier Ltd. All rights reserved.
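As a rough illustration of the Shapley idea the abstract refers to, the sketch below computes exact Shapley values for a tiny hand-written model by enumerating every feature coalition. It is not MarkerML's implementation: the function `shapley_values`, the model `toy_model`, its three "taxon abundance" inputs, and the all-zero baseline are all hypothetical, chosen only to show how an interaction between two features is split between them. Practical SHAP tooling relies on approximations, since exact enumeration grows exponentially with the number of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are replaced by their baseline value
    (an assumption of this sketch, not a claim about MarkerML).
    """
    n = len(x)

    def value(subset):
        # Evaluate the model with only the features in `subset` "switched on".
        masked = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(masked)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phis


def toy_model(abundances):
    # Hypothetical score: two main effects plus an A*B interaction;
    # the third feature is deliberately ignored by the model.
    a, b, c = abundances
    return 2.0 * a + 1.0 * b + a * b


sample = [1.0, 2.0, 5.0]      # hypothetical abundances of taxa A, B, C
baseline = [0.0, 0.0, 0.0]    # hypothetical reference sample
phis = shapley_values(toy_model, sample, baseline)
print(phis)
```

In this toy case the A*B interaction term is shared equally between features A and B, the unused feature C receives a value of zero, and the three values sum to the difference between the model output for the sample and for the baseline.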
Pages: 12