Integration of pan-cancer multi-omics data for novel mixed subgroup identification using machine learning methods

Cited by: 4
Authors
Khadirnaikar, Seema [1]
Shukla, Sudhanshu [2]
Prasanna, S. R. M. [1]
Affiliations
[1] Indian Inst Technol Dharwad, Dept Elect Engn, Dharwad, Karnataka, India
[2] Indian Inst Technol Dharwad, Dept Biosci & Bioengn, Dharwad, Karnataka, India
Source
PLOS ONE | 2023 / Volume 18 / Issue 10
Keywords
MOLECULAR CLASSIFICATION; HETEROGENEITY
DOI
10.1371/journal.pone.0287176
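Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract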
Cancer is a heterogeneous disease, and patients with tumors from different organs can share similar epigenetic and genetic alterations. It is therefore crucial to identify novel subgroups of patients with similar molecular characteristics. A better treatment strategy can be proposed when patient heterogeneity is accounted for during subgroup identification, irrespective of the tissue of origin. This work proposes a machine learning (ML) based pipeline for pan-cancer subgroup identification. mRNA, miRNA, DNA methylation, and protein expression features from pan-cancer samples were concatenated and non-linearly projected to a lower dimension using an ML algorithm. The projected data were then clustered to identify novel multi-omics-based subgroups. Clinical characterization of these ML subgroups indicated significant differences in overall survival (OS) and disease-free survival (DFS) (p-value < 0.0001). The subgroups, formed by patients from different tumors, shared similar molecular alterations in terms of immune microenvironment, mutation profile, and enriched pathways. Further, decision-level and feature-level fused classification models were built to assign unseen samples to the novel subgroups. The classification models were then used to obtain class labels for the validation samples, and the molecular characteristics were verified. To summarize, this work identified novel ML subgroups using multi-omics data and showed that patients with different tumor types can be molecularly similar. We also proposed and validated classification models for subgroup identification. The proposed classification models can be used to identify the novel multi-omics subgroups, and the molecular characteristics of each subgroup can be used to design an appropriate treatment regimen.
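The concatenate, project, and cluster steps described in the abstract can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' implementation: the abstract does not name the projection or clustering algorithm, so the tanh random projection and the plain k-means below are stand-ins chosen for brevity, and all function names, matrix sizes, and the choice of three clusters are assumptions.

```python
import numpy as np


def zscore(block):
    """Standardize each feature (column) of one omics block."""
    mu = block.mean(axis=0)
    sd = block.std(axis=0)
    sd[sd == 0] = 1.0  # avoid division by zero for constant features
    return (block - mu) / sd


def concatenate_omics(blocks):
    """Feature-level concatenation: samples x (sum of per-omics features)."""
    return np.hstack([zscore(b) for b in blocks])


def nonlinear_project(X, n_components, rng):
    """Stand-in non-linear projection (assumption): tanh of a random linear map.

    The source only says "an ML algorithm"; a trained model would go here.
    """
    W = rng.normal(size=(X.shape[1], n_components)) / np.sqrt(X.shape[1])
    return np.tanh(X @ W)


def kmeans(Z, k, rng, n_iter=50):
    """Plain Lloyd's k-means (assumption); returns one integer label per sample."""
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every center.
        d = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Synthetic stand-ins for the four omics layers (rows are the same samples).
rng = np.random.default_rng(0)
n_samples = 60
mrna = rng.normal(size=(n_samples, 20))
mirna = rng.normal(size=(n_samples, 10))
methylation = rng.normal(size=(n_samples, 15))
protein = rng.normal(size=(n_samples, 5))

X = concatenate_omics([mrna, mirna, methylation, protein])
Z = nonlinear_project(X, n_components=8, rng=rng)
labels = kmeans(Z, k=3, rng=rng)
```

Each omics block is standardized before concatenation so that no single layer dominates the projection purely through scale; whether the original pipeline normalizes this way is not stated in the abstract.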
Pages: 21