Pan-cancer classification by regularized multi-task learning

Cited by: 6
Authors
Hossain, Sk Md Mosaddek [1]
Khatun, Lutfunnesa [2]
Ray, Sumanta [1]
Mukhopadhyay, Anirban [2]
Affiliations
[1] Aliah Univ, Comp Sci & Engn, Kolkata 700160, India
[2] Univ Kalyani, Comp Sci & Engn, Kalyani 741235, W Bengal, India
Keywords
INFORMATION; PROGNOSIS; MODEL
DOI
10.1038/s41598-021-03554-8
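Chinese Library Classification (CLC) codes
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract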
Classifying pan-cancer samples by their gene expression patterns is a crucial challenge for the accurate diagnosis and treatment of cancer patients. Machine learning algorithms are established tools for downstream analysis and for capturing deviations in gene expression patterns across diverse diseases. In this work, we developed PC-RMTL, a pan-cancer classification model based on regularized multi-task learning (RMTL), to classify 21 cancer types and adjacent normal samples using RNA-Seq data obtained from The Cancer Genome Atlas (TCGA). PC-RMTL outperforms five state-of-the-art classification algorithms: SVM with a linear kernel (SVM-Lin), SVM with a radial basis function kernel (SVM-RBF), random forest (RF), k-nearest neighbours (kNN), and decision trees (DT). On a completely unseen independent test set, PC-RMTL achieves 96.07% accuracy and a Matthews correlation coefficient (MCC) of 95.80%. The only real competitor is SVM-Lin, which nearly matches the prediction accuracy of PC-RMTL, but only when the complete feature set is provided for training; otherwise, PC-RMTL outperforms all other classification models. To the best of our knowledge, this is a significant improvement over existing work in pan-cancer classification, which has failed to distinguish many cancer types from one another reliably. We also compared the expression patterns of the top discriminating genes across the cancers and performed a functional enrichment analysis of those genes, which uncovers several interesting facts about distinguishing pan-cancer samples.
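The abstract reports two metrics, accuracy and MCC, for a multi-class problem. As a rough illustration of how such scores can be computed from predicted and true labels, the sketch below implements accuracy and the multi-class generalization of the Matthews correlation coefficient. It assumes that generalization is the one intended, which the abstract does not state; the function names and the toy labels (`cancer_A`, `cancer_B`, `normal`) are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import sqrt


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def multiclass_mcc(y_true, y_pred):
    """Multi-class Matthews correlation coefficient.

    Uses the count-based form
        (c*s - sum_k p_k*t_k) / sqrt((s^2 - sum_k p_k^2) * (s^2 - sum_k t_k^2))
    where s is the number of samples, c the number of correct predictions,
    t_k how often class k truly occurs and p_k how often it is predicted.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    s = len(y_true)
    c = sum(t == p for t, p in zip(y_true, y_pred))
    t_counts = Counter(y_true)
    p_counts = Counter(y_pred)
    labels = set(t_counts) | set(p_counts)
    sum_pt = sum(p_counts[k] * t_counts[k] for k in labels)
    sum_pp = sum(v * v for v in p_counts.values())
    sum_tt = sum(v * v for v in t_counts.values())
    denom = sqrt((s * s - sum_pp) * (s * s - sum_tt))
    if denom == 0:
        # Degenerate case (e.g. only one class present): report 0 by convention.
        return 0.0
    return (c * s - sum_pt) / denom


# Toy labels for illustration only; not data from the paper.
y_true = ["cancer_A", "cancer_A", "cancer_B", "cancer_B", "normal", "normal"]
y_pred = ["cancer_A", "cancer_A", "cancer_B", "normal", "normal", "normal"]

print("accuracy:", accuracy(y_true, y_pred))
print("MCC:", multiclass_mcc(y_true, y_pred))
```

Unlike accuracy, MCC accounts for how predictions are distributed across classes, which may be why both figures are reported for a task with many cancer types of unequal sample size.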
Pages: 11
Related papers (first 10 of 50 shown)
  • [1] Pan-cancer classification by regularized multi-task learning
    Sk Md Mosaddek Hossain
    Lutfunnesa Khatun
    Sumanta Ray
    Anirban Mukhopadhyay
    Scientific Reports, 11
  • [2] Task Variance Regularized Multi-Task Learning
    Mao, Yuren
    Wang, Zekai
    Liu, Weiwei
    Lin, Xuemin
    Hu, Wenbin
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (08) : 8615 - 8629
  • [3] Manifold Regularized Multi-Task Learning
    Yang, Peipei
    Zhang, Xu-Yao
    Huang, Kaizhu
    Liu, Cheng-Lin
    NEURAL INFORMATION PROCESSING, ICONIP 2012, PT III, 2012, 7665 : 528 - 536
  • [4] Uncertainty Regularized Multi-Task Learning
    Meshgi, Kourosh
    Mirzaei, Maryam Sadat
    Sekine, Satoshi
    PROCEEDINGS OF THE 12TH WORKSHOP ON COMPUTATIONAL APPROACHES TO SUBJECTIVITY, SENTIMENT & SOCIAL MEDIA ANALYSIS, 2022, : 78 - 88
  • [5] Using Regularized Multi-Task Learning for Schizophrenia MRI Data Classification
    Wang, Yu
    Shi, Jiantong
    Xiao, Hongbing
    JOURNAL OF INTEGRATIVE NEUROSCIENCE, 2022, 21 (04)
  • [6] Cancer Classification with Multi-task Deep Learning
    Liao, Qing
    Jiang, Lin
    Wang, Xuan
    Zhang, Chunkai
    Ding, Ye
    2017 INTERNATIONAL CONFERENCE ON SECURITY, PATTERN ANALYSIS, AND CYBERNETICS (SPAC), 2017, : 76 - 81
  • [7] Sign-Regularized Multi-Task Learning
    Bai, Guangji
    Torres, Johnny
    Wang, Junxiang
    Zhao, Liang
    Abad, Cristina
    Vaca, Carmen
    PROCEEDINGS OF THE 2023 SIAM INTERNATIONAL CONFERENCE ON DATA MINING, SDM, 2023, : 793 - 801
  • [8] Multi-task learning for regularized PET reconstruction
    Yang, Bao
    Zhu, Wentao
    JOURNAL OF NUCLEAR MEDICINE, 2021, 62
  • [9] Saliency-Regularized Deep Multi-Task Learning
    Bai, Guangji
    Zhao, Liang
    PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022, : 15 - 25
  • [10] Safe sample screening for regularized multi-task learning
    Mei, Benshan
    Xu, Yitian
    KNOWLEDGE-BASED SYSTEMS, 2020, 204