NeoMutate: an ensemble machine learning framework for the prediction of somatic mutations in cancer

Cited by: 32
Authors
Anzar, Irantzu [1]
Sverchkova, Angelina [1]
Stratford, Richard [1]
Clancy, Trevor [1]
Affiliations
[1] OncoImmunity AS, Oslo Canc Cluster, Ullernchausseen 64-66, N-0379 Oslo, Norway
Keywords
Somatic variant detection; Machine learning; Cancer genomics; Precision medicine; POINT MUTATIONS; IDENTIFICATION; ALGORITHMS; DISCOVERY; VARIANTS; PIPELINE;
DOI
10.1186/s12920-019-0508-5
Chinese Library Classification
Q3 [Genetics];
Discipline classification codes
071007 ; 090102 ;
Abstract
Background: The accurate screening of tumor genomic landscapes for somatic mutations using high-throughput sequencing is a crucial step in precise clinical diagnosis and targeted therapy. However, the complex inherent features of cancer tissue, especially intra-tumor genetic heterogeneity, coupled with sequencing and alignment artifacts, make somatic variant calling a challenging task. Current variant filtering strategies, such as rule-based filtering and consensus voting of different algorithms, have helped to increase specificity, although this comes at the cost of sensitivity.

Methods: In light of this, we have developed the NeoMutate framework, which incorporates 7 supervised machine learning (ML) algorithms to exploit the strengths of multiple variant callers, using a non-redundant set of biological and sequence features. We benchmarked NeoMutate by simulating more than 10,000 bona fide cancer-related mutations into three well-characterized Genome in a Bottle (GIAB) reference samples.

Results: A robust and exhaustive evaluation of NeoMutate's performance, based on 5-fold cross-validation experiments in addition to 3 independent tests, demonstrated substantially improved variant detection accuracy compared with any of its individual constituent variant callers and with consensus calling of multiple tools.

Conclusions: We show here that integrating multiple tools in an ensemble ML layer optimizes somatic variant detection rates, leading to a potentially improved variant selection framework for the diagnosis and treatment of cancer.
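The specificity/sensitivity trade-off of consensus voting mentioned in the Background can be illustrated with a minimal sketch. This is not NeoMutate's implementation: the caller names, the toy variants, and the helper functions `consensus_calls` and `evaluate` are all hypothetical, and the false-positive count is used as a stand-in for specificity because a call set does not enumerate true negatives.

```python
from typing import Dict, Set, Tuple

# A variant is identified here by (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def consensus_calls(calls_by_caller: Dict[str, Set[Variant]], min_votes: int) -> Set[Variant]:
    """Keep variants reported by at least `min_votes` callers."""
    votes: Dict[Variant, int] = {}
    for calls in calls_by_caller.values():
        for variant in calls:
            votes[variant] = votes.get(variant, 0) + 1
    return {variant for variant, n in votes.items() if n >= min_votes}


def evaluate(predicted: Set[Variant], truth: Set[Variant]) -> Tuple[float, int]:
    """Return (sensitivity, number of false positives) for a call set."""
    true_positives = len(predicted & truth)
    sensitivity = true_positives / len(truth) if truth else 0.0
    false_positives = len(predicted - truth)
    return sensitivity, false_positives


# Hypothetical toy data: four true somatic variants and two artifacts.
A = ("chr1", 100, "A", "T")
B = ("chr1", 200, "C", "G")
C = ("chr2", 300, "G", "A")
D = ("chr3", 400, "T", "C")
X = ("chr4", 500, "A", "G")  # artifact reported by two callers
Y = ("chr5", 600, "C", "T")  # artifact reported by one caller

truth = {A, B, C, D}
calls = {
    "caller_1": {A, B, C, X},
    "caller_2": {A, B, Y},
    "caller_3": {A, C, D, X},
}

# Evaluate union (1 vote), majority (2 votes), and intersection (3 votes).
results = {k: evaluate(consensus_calls(calls, k), truth) for k in (1, 2, 3)}
for k, (sensitivity, false_positives) in results.items():
    print(f"min_votes={k}: sensitivity={sensitivity:.2f}, false positives={false_positives}")
```

With this toy input, requiring 1, 2, or 3 votes gives sensitivities of 1.00, 0.75, and 0.25 with 2, 1, and 0 false positives respectively: each stricter vote threshold removes artifacts but also discards true variants. A learned ensemble layer, as the abstract describes, would replace the fixed vote threshold with a model trained on caller outputs and additional features.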
Pages: 14