Integration of pan-cancer multi-omics data for novel mixed subgroup identification using machine learning methods

Cited by: 4
Authors
Khadirnaikar, Seema [1]
Shukla, Sudhanshu [2]
Prasanna, S. R. M. [1]
Affiliations
[1] Indian Inst Technol Dharwad, Dept Elect Engn, Dharwad, Karnataka, India
[2] Indian Inst Technol Dharwad, Dept Biosci & Bioengn, Dharwad, Karnataka, India
Source
PLOS ONE | 2023 / Vol. 18 / Issue 10
Keywords
MOLECULAR CLASSIFICATION; HETEROGENEITY
DOI
10.1371/journal.pone.0287176
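Chinese Library Classification (CLC) code
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification code
07; 0710; 09
Abstract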
Cancer is a heterogeneous disease, and patients with tumors from different organs can share similar epigenetic and genetic alterations. It is therefore crucial to identify novel subgroups of patients with similar molecular characteristics. A better treatment strategy can be proposed when patient heterogeneity is accounted for during subgroup identification, irrespective of the tissue of origin. This work proposes a machine learning (ML) based pipeline for pan-cancer subgroup identification. mRNA, miRNA, DNA methylation, and protein expression features from pan-cancer samples were concatenated and non-linearly projected to a lower dimension using an ML algorithm. The projected data were then clustered to identify novel multi-omics-based subgroups. Clinical characterization of these ML subgroups indicated significant differences in overall survival (OS) and disease-free survival (DFS) (p-value < 0.0001). Subgroups formed by patients with different tumors shared similar molecular alterations in terms of immune microenvironment, mutation profile, and enriched pathways. Further, decision-level and feature-level fused classification models were built to assign unseen samples to the novel subgroups. These classification models were then used to obtain class labels for the validation samples, and the molecular characteristics of the subgroups were verified on them. In summary, this work identified novel ML subgroups using multi-omics data and showed that patients with different tumor types can be molecularly similar. The proposed and validated classification models can be used to identify the novel multi-omics subgroups, and the molecular characteristics of each subgroup can be used to design an appropriate treatment regimen.
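The difference between the feature-level and decision-level fusion mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not name the classifiers used, so a simple nearest-centroid rule stands in for them here. The modality keys, the subgroup labels `S1` and `S2`, the helper names, and the toy values are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import dist

# Hypothetical modality names; the paper uses mRNA, miRNA,
# DNA methylation, and protein expression features.
MODALITIES = ("mrna", "mirna", "methylation", "protein")


def fit_centroids(rows, labels):
    """Return {label: mean vector} for feature rows with the given labels."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return {
        label: [sum(col) / len(members) for col in zip(*members)]
        for label, members in groups.items()
    }


def nearest(centroids, x):
    """Assign x to the label of the closest centroid (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(centroids[label], x))


def concat(sample):
    """Feature-level fusion: join all modality vectors into one vector."""
    return [v for m in MODALITIES for v in sample[m]]


def train(samples, labels):
    """Fit one model per modality and one model on concatenated features."""
    per_modality = {
        m: fit_centroids([s[m] for s in samples], labels) for m in MODALITIES
    }
    fused = fit_centroids([concat(s) for s in samples], labels)
    return per_modality, fused


def predict_feature_level(fused, sample):
    """Classify once, on the concatenated feature vector."""
    return nearest(fused, concat(sample))


def predict_decision_level(per_modality, sample):
    """Classify per modality, then take a majority vote.

    Ties resolve to the label voted first in MODALITIES order.
    """
    votes = Counter(nearest(per_modality[m], sample[m]) for m in MODALITIES)
    return votes.most_common(1)[0][0]


# Toy training data (invented for illustration only).
train_samples = [
    {"mrna": [0.1, 0.2], "mirna": [0.0], "methylation": [0.2], "protein": [0.1]},
    {"mrna": [0.2, 0.1], "mirna": [0.1], "methylation": [0.1], "protein": [0.0]},
    {"mrna": [0.9, 1.0], "mirna": [1.0], "methylation": [0.8], "protein": [0.9]},
    {"mrna": [1.0, 0.9], "mirna": [0.9], "methylation": [0.9], "protein": [1.0]},
]
train_labels = ["S1", "S1", "S2", "S2"]

per_modality, fused = train(train_samples, train_labels)

# An unseen sample whose methylation profile resembles S1
# while the other three modalities resemble S2.
unseen = {"mrna": [0.8, 0.9], "mirna": [0.95], "methylation": [0.2], "protein": [0.85]}

feature_level_label = predict_feature_level(fused, unseen)
decision_level_label = predict_decision_level(per_modality, unseen)
print(feature_level_label, decision_level_label)
```

In this toy case both strategies agree even though one modality disagrees with the rest: the decision-level vote is three to one, and in the concatenated space the sample remains closer to the `S2` centroid. With real data the two strategies can disagree, which is one reason to build and compare both.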
Pages: 21