Identifying Differentially Expressed Genes in RNA Sequencing Data With Small Labelled Samples

Authors
Guo, Yin [1]
Xiao, Yanni [1]
Li, Limin [1]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Math & Stat, Xian 710049, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Differentially expressed genes; small sample problem; two-sample independent test; auxiliary sample; Wilcoxon-Mann-Whitney test; POWER; PROGRESS; SEQ
DOI
10.1109/TCBB.2024.3382147
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
RNA-seq, including bulk RNA-seq and single-cell RNA-seq, is a next-generation sequencing-based RNA profiling method that measures gene expression patterns at high resolution, and it has become an essential tool for analyzing differential gene expression at the whole-transcriptome level. Identifying differentially expressed genes is a key problem in many biological studies, such as disease genetics. Two-sample location tests are widely used in case-control studies to identify significantly differential genes. However, because labelled data are costly to collect, many studies have only a small number of labelled samples, and traditional methods often lose power in this setting. To address this issue, we propose a novel rank-based nonparametric test, the WMW-A test, which extends the Wilcoxon-Mann-Whitney test to a three-sample statistic by introducing an auxiliary sample that is either given or generated in the form of unlabelled data. Combining the case, control and auxiliary samples, we construct the three-sample WMW-A statistic from the gap between the average ranks of the case and control samples in the combined sample. Extensive simulation experiments and real applications on gene expression datasets, including one bulk RNA-seq dataset and two single-cell RNA-seq datasets, show that the WMW-A test can significantly improve test power for the two-sample problem with small sample sizes, using either available or generated auxiliary data. Applications to two real small SARS-CoV-2 datasets further show that the WMW-A test improves the identification of differentially expressed genes with small labelled samples.
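The rank-gap idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' WMW-A implementation: the abstract does not give the exact statistic, its normalization, or its null distribution. The function names `average_ranks`, `rank_gap_statistic`, and `permutation_pvalue` are hypothetical, and the label-permutation p-value is an assumption substituted here for whatever reference distribution the paper derives.

```python
import random


def average_ranks(values):
    """Return 1-based ranks of values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def rank_gap_statistic(case, control, auxiliary):
    """Mean rank of case minus mean rank of control in the pooled three samples."""
    pooled = list(case) + list(control) + list(auxiliary)
    ranks = average_ranks(pooled)
    n_case, n_control = len(case), len(control)
    case_mean = sum(ranks[:n_case]) / n_case
    control_mean = sum(ranks[n_case:n_case + n_control]) / n_control
    return case_mean - control_mean


def permutation_pvalue(case, control, auxiliary, n_perm=2000, seed=0):
    """Two-sided p-value from shuffling case/control labels (an assumed null).

    The auxiliary sample is unlabelled, so it stays fixed across permutations.
    """
    rng = random.Random(seed)
    observed = abs(rank_gap_statistic(case, control, auxiliary))
    labelled = list(case) + list(control)
    n_case = len(case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labelled)
        stat = rank_gap_statistic(labelled[:n_case], labelled[n_case:], auxiliary)
        if abs(stat) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

In this sketch the auxiliary values only change where the case and control observations fall in the pooled ranking, which is one plausible reading of how unlabelled data can inform a test on small labelled samples.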
Pages: 1311 - 1321
Number of pages: 11
Related papers
50 in total
  • [31] Robustness of single-cell RNA-seq for identifying differentially expressed genes
    Liu, Yong
    Huang, Jing
    Pandey, Rajan
    Liu, Pengyuan
    Therani, Bhavika
    Qiu, Qiongzi
    Rao, Sridhar
    Geurts, Aron M.
    Cowley Jr, Allen W.
    Greene, Andrew S.
    Liang, Mingyu
    BMC GENOMICS, 2023, 24 (01)
  • [32] Ranking analysis of microarray data: A powerful method for identifying differentially expressed genes
    Tan, Yuan-De
    Fornage, Myriam
    Fu, Yun-Xin
    GENOMICS, 2006, 88 (06) : 846 - 854
  • [33] An Integrative Bioinformatics Analysis of Microarray Data for Identifying Differentially Expressed Genes in Preeclampsia
    Song, L. M.
    Long, M.
    Song, S. J.
    Wang, J. R.
    Zhao, G. W.
    Zhao, N.
    RUSSIAN JOURNAL OF GENETICS, 2022, 58 (07) : 866 - 875
  • [34] Identification of differentially expressed genes in Budd-Chiari syndrome by RNA-sequencing
    Yang, Bin
    Qu, Dong
    Zhao, An-Li
    Li, Yu
    Meng, Ran-Ran
    Yu, Ji-Xiang
    Gao, Peng
    Lin, Hua Peng
    MOLECULAR MEDICINE REPORTS, 2017, 16 (06) : 8011 - 8018
  • [35] An Integrative Bioinformatics Analysis of Microarray Data for Identifying Differentially Expressed Genes in Preeclampsia
    L. M. Song
    M. Long
    S. J. Song
    J. R. Wang
    G. W. Zhao
    N. Zhao
    Russian Journal of Genetics, 2022, 58 : 866 - 875
  • [36] RNA sequencing analyses reveal novel differentially expressed genes and pathways in pancreatic cancer
    Mao, Yixiang
    Shen, Jianjun
    Lu, Yue
    Lin, Kevin
    Wang, Huamin
    Li, Yanan
    Chang, Ping
    Walker, Mary G.
    Li, Donghui
    ONCOTARGET, 2017, 8 (26) : 42537 - 42547
  • [37] Bioinformatics Analysis of Differentially Expressed Genes in Carpal Tunnel Syndrome Using RNA Sequencing
    Oveisee, Maziar
    Gholipour, Akram
    Moghaddam, Mahrokh Bagheri
    Malakootian, Mahshid
    IRANIAN JOURNAL OF PUBLIC HEALTH, 2024, 53 (08) : 1871 - 1882
  • [38] Identifying differentially expressed genes in goat mammary epithelial cells induced by overexpression of SOCS3 gene using RNA sequencing
    Song, Ning
    Ma, Cunxia
    Guo, Yuzhu
    Cui, Shuangshuang
    Chen, Shihao
    Chen, Zhi
    Ling, Yinghui
    Zhang, Yunhai
    Liu, Hongyu
    FRONTIERS IN VETERINARY SCIENCE, 2024, 11
  • [39] Selective differential fingerprinting - A method for identifying differentially expressed genes in a family between two samples
    Weber, KL
    Bolander, ME
    Sarkar, G
    MOLECULAR BIOTECHNOLOGY, 1998, 10 (01) : 77 - 81
  • [40] Sparse Orthogonal Nonnegative Matrix Factorization for Identifying Differentially Expressed Genes and Clustering Tumor Samples
    Dai, Ling-Yun
    Liu, Jin-Xing
    Zhu, Rong
    Kong, Xiang-Zhen
    Hou, Mi-Xiao
    Yuan, Sha-Sha
    PROCEEDINGS 2018 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2018 : 1332 - 1337