PathME: pathway based multi-modal sparse autoencoders for clustering of patient-level multi-omics data

Cited by: 39
Authors
Lemsara, Amina [1 ]
Ouadfel, Salima [1 ]
Froehlich, Holger [2 ,3 ]
Affiliations
[1] Univ Constantine 2, Comp Sci Dept, Constantine 25016, Algeria
[2] Univ Bonn, Int Ctr IT, D-53115 Bonn, Germany
[3] Fraunhofer Inst for Algorithms & Sci Comp SCAI, D-53754 Sankt Augustin, Germany
Keywords
Deep learning; Patient clustering; Multi-omics; MOLECULAR PORTRAITS; PROGNOSTIC-FACTOR; CLASS DISCOVERY; BREAST-CANCER; LUNG-CANCER; EXPRESSION; METHYLATION; MUTATION; SUBTYPES; PROTEIN
DOI
10.1186/s12859-020-3465-2
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Recent years have witnessed increasing interest in multi-omics data, because these data allow a better understanding of complex diseases such as cancer at the molecular system level. In addition, multi-omics data increase the chance of robustly identifying molecular patient subgroups and hence open the door to better personalized treatment of diseases. Several methods have been proposed for unsupervised clustering of multi-omics data. However, a number of challenges remain, such as the magnitude of features and the large difference in dimensionality across different omics data sources.
Results: We propose a multi-modal sparse denoising autoencoder framework coupled with sparse non-negative matrix factorization to robustly cluster patients based on multi-omics data. The proposed model specifically leverages pathway information to reduce the dimensionality of omics data to a pathway- and patient-specific score profile. As a consequence, our method allows us to understand which pathway is a feature of which particular patient cluster. Moreover, recently proposed machine learning techniques allow us to disentangle the specific impact of each individual omics feature on a pathway score. We applied our method to cluster patients in several cancer datasets using gene expression, miRNA expression, DNA methylation and CNVs, demonstrating the possibility of obtaining biologically plausible disease subtypes characterized by specific molecular features. Comparison against several competing methods showed competitive clustering performance. In addition, post-hoc analysis of somatic mutations and clinical data provided supporting evidence for, and interpretation of, the identified clusters.
Conclusions: Our suggested multi-modal sparse denoising autoencoder approach allows for an effective and interpretable integration of multi-omics data at the pathway level while addressing the high-dimensional character of omics data. Patient-specific pathway score profiles derived from our model allow for a robust identification of disease subgroups.
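The clustering stage described in the abstract (sparse non-negative matrix factorization applied to a patient-by-pathway score profile) can be illustrated with a minimal sketch. This is not the authors' implementation: the autoencoder stage that would produce the scores is omitted, the toy `scores` matrix is made up, the function names and the `lam`, `n_iter` and `seed` defaults are hypothetical, and plain multiplicative updates with an L1 penalty on the patient loadings stand in for whatever solver the paper actually uses.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    """Return the transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*a)]

def sparse_nmf(x, k, lam=0.01, n_iter=300, seed=0, eps=1e-9):
    """Approximate non-negative x (patients x pathways) as w @ h.

    Uses multiplicative updates for the squared-error loss, with an
    L1 penalty `lam` on the patient loadings w (hypothetical default).
    """
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # Update the pathway signatures h (k x pathways).
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # Rescale each signature to sum to 1 and move the scale into w,
        # so w @ h is unchanged and the L1 penalty cannot be dodged by rescaling.
        for a in range(k):
            s = sum(h[a]) or 1.0
            h[a] = [v / s for v in h[a]]
            for i in range(n):
                w[i][a] *= s
        # Update the patient loadings w; `lam` in the denominator shrinks them.
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + lam + eps) for a in range(k)]
             for i in range(n)]
    return w, h

def assign_clusters(w):
    """Assign each patient to the component with the largest loading."""
    return [max(range(len(row)), key=row.__getitem__) for row in w]

# Toy pathway scores (made up): 6 patients x 4 pathways, two obvious groups.
scores = [
    [0.9, 0.8, 0.1, 0.0],
    [0.8, 0.9, 0.0, 0.1],
    [0.9, 0.9, 0.1, 0.1],
    [0.1, 0.0, 0.9, 0.8],
    [0.0, 0.1, 0.8, 0.9],
    [0.1, 0.1, 0.9, 0.9],
]
w, h = sparse_nmf(scores, k=2)
labels = assign_clusters(w)
print(labels)
```

In this sketch each row of `h` can be read as a pathway signature of one cluster, which mirrors the interpretability claim in the abstract: the pathways with the largest entries in a signature are the ones that characterize that patient group.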
Pages: 20
Related papers
50 in total
  • [1] PathME: pathway based multi-modal sparse autoencoders for clustering of patient-level multi-omics data
    Amina Lemsara
    Salima Ouadfel
    Holger Fröhlich
    BMC Bioinformatics, 21
  • [2] A deep contrastive multi-modal encoder for multi-omics data integration and analysis
    Yinghua, Ma
    Khan, Ahmad
    Heng, Yang
    Khan, Fiaz Gul
    Ali, Farman
    Al-Otaibi, Yasser D.
    Bashir, Ali Kashif
    INFORMATION SCIENCES, 2025, 700
  • [3] MOVIS: A multi-omics software solution for multi-modal time-series clustering, embedding, and visualizing tasks
    Anzel, Aleksandar
    Heider, Dominik
    Hattab, Georges
    COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL, 2022, 20 : 1044 - 1055
  • [4] Multi-modal data novelty detection with adversarial autoencoders
    Chen, Zeqiu
    Zhao, Kaiyi
    Sun, Ruizhi
    APPLIED SOFT COMPUTING, 2024, 165
  • [5] Integrative clustering methods for multi-omics data
    Zhang, Xiaoyu
    Zhou, Zhenwei
    Xu, Hanfei
    Liu, Ching-Ti
    WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL STATISTICS, 2022, 14 (03)
  • [6] Representation Learning for the Clustering of Multi-Omics Data
    Viaud, Gautier
    Mayilvahanan, Prasanna
    Cournede, Paul-Henry
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2022, 19 (01) : 135 - 145
  • [7] Editorial: Integrative multi-modal, multi-omics analytics for the better understanding of metabolic diseases
    Acharjee, Animesh
    Agarwal, Prasoon
    Gkoutos, Georgios V.
    FRONTIERS IN ENDOCRINOLOGY, 2023, 14
  • [8] DeepDRA: Drug repurposing using multi-omics data integration with autoencoders
    Mohammadzadeh-Vardin, Taha
    Ghareyazi, Amin
    Gharizadeh, Ali
    Abbasi, Karim
    Rabiee, Hamid R.
    PLOS ONE, 2024, 19 (07)
  • [9] A survey on data integration for multi-omics sample clustering
    Lovino, Marta
    Randazzo, Vincenzo
    Ciravegna, Gabriele
    Barbiero, Pietro
    Ficarra, Elisa
    Cirrincione, Giansalvo
    NEUROCOMPUTING, 2022, 488 : 494 - 508
  • [10] Spectral clustering of weighted variables on multi-omics data
    Lee, Yunjung
    Park, Seyoung
    KOREAN JOURNAL OF APPLIED STATISTICS, 2023, 36 (03) : 175 - 196