Utilizing somatic mutation data from numerous studies for cancer research: proof of concept and applications

Cited by: 13
Authors
Amar, D. [1]
Izraeli, S. [2, 3]
Shamir, R. [1]
Affiliations
[1] Tel Aviv Univ, Blavatnik Sch Comp Sci, Tel Aviv Univ Campus, IL-69978 Tel Aviv, Israel
[2] Sheba Med Ctr, Safra Childrens Hosp, Dept Pediat Hematol Oncol, Ramat Gan, Israel
[3] Tel Aviv Univ, Sackler Sch Med, Tel Aviv, Israel
Funding
Israel Science Foundation
Keywords
HEDGEHOG SIGNALING PATHWAY; GENE-EXPRESSION PROFILES; INTEGRATED ANALYSIS; IDENTIFICATION; CYTOSCAPE; ONTOLOGY; BIOLOGY; SMAD4; KRAS;
DOI
10.1038/onc.2016.489
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Large cancer projects measure somatic mutations in thousands of samples, gradually assembling a catalog of recurring mutations in cancer. Many methods analyze these data jointly with auxiliary information with the aim of identifying subtype-specific results. Here, we show that somatic gene mutations alone can reliably and specifically predict cancer subtypes. Interpretation of the classifiers provides useful insights for several biomedical applications. We analyze the COSMIC database, which collects somatic mutations from The Cancer Genome Atlas (TCGA) as well as from many smaller scale studies. We use multi-label classification techniques and the Disease Ontology hierarchy in order to identify cancer subtype-specific biomarkers. Cancer subtype classifiers based on TCGA and the smaller studies have comparable performance, and the smaller studies add a substantial value in terms of validation, coverage of additional subtypes, and improved classification. The gene sets of the classifiers are used for threefold contribution. First, we refine the associations of genes to cancer subtypes and identify novel compelling candidate driver genes. Second, using our classifiers we successfully predict the primary site of metastatic samples. Third, we provide novel hypotheses regarding detection of subtype-specific synthetic lethality interactions. From the cancer research community perspective, our results suggest that curation efforts, such as COSMIC, have great added and complementary value even in the era of large international cancer projects.
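As a rough illustration of how a disease hierarchy can turn single subtype annotations into a multi-label problem, the sketch below expands each sample's subtype to include all of its ancestors and then tabulates, per label, the fraction of samples mutated in each gene. It is not the authors' method: the hierarchy in `PARENT`, the samples in `SAMPLES`, and the function names are all hypothetical and invented for this example, and the per-label frequencies are only a stand-in for the features a real classifier would use.

```python
from collections import defaultdict

# Hypothetical toy hierarchy (child -> parent); NOT taken from the Disease Ontology.
PARENT = {
    "lung adenocarcinoma": "lung cancer",
    "lung squamous cell carcinoma": "lung cancer",
    "lung cancer": "cancer",
    "pancreatic adenocarcinoma": "pancreatic cancer",
    "pancreatic cancer": "cancer",
}


def expand_labels(label):
    """Return the label together with all of its ancestors in PARENT."""
    labels = {label}
    while label in PARENT:
        label = PARENT[label]
        labels.add(label)
    return labels


def gene_label_frequencies(samples):
    """For each (expanded) label, compute the fraction of its samples mutated in each gene."""
    label_counts = defaultdict(int)
    gene_counts = defaultdict(lambda: defaultdict(int))
    for mutated_genes, label in samples:
        # Each sample counts toward its own subtype and every ancestor label.
        for lab in expand_labels(label):
            label_counts[lab] += 1
            for gene in mutated_genes:
                gene_counts[lab][gene] += 1
    return {
        lab: {g: c / n for g, c in gene_counts[lab].items()}
        for lab, n in label_counts.items()
    }


# Invented toy samples: (set of mutated genes, most specific subtype label).
SAMPLES = [
    ({"KRAS", "TP53"}, "lung adenocarcinoma"),
    ({"TP53"}, "lung squamous cell carcinoma"),
    ({"KRAS", "SMAD4"}, "pancreatic adenocarcinoma"),
    ({"KRAS"}, "pancreatic adenocarcinoma"),
]

freqs = gene_label_frequencies(SAMPLES)
```

In this toy data, a gene that is frequent under one label and rare under its siblings would be a candidate subtype-specific marker; a real analysis would need proper classifiers, held-out validation, and far more samples.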
Pages: 3375-3383
Number of pages: 9
Published in: Oncogene, 2017, 36: 3375-3383