CART variance stabilization and regularization for high-throughput genomic data

Cited by: 6

Authors:

Papana, Ariadni

Ishwaran, Hemant

Affiliations:

[1] Cleveland Clin, Dept Quantitat Hlth Sci, Cleveland, OH 44195 USA

[2] Case Western Reserve Univ, Dept Stat, Cleveland, OH 44106 USA

Source:

BIOINFORMATICS | 2006 / Vol. 22 / Issue 18

Funding:

US National Science Foundation;

DOI:

10.1093/bioinformatics/btl384

Chinese Library Classification number:

Q5 [Biochemistry];

Discipline classification codes:

071010 ; 081704 ;

Abstract:

Motivation: mRNA expression data obtained from high-throughput DNA microarrays exhibit strong departures from homogeneity of variances, and the relationship between mean expression value and variance is often complex. Variance stabilization of such data is crucial for many types of statistical analysis, while regularization of variances (pooling of information) can greatly improve the overall accuracy of test statistics. Results: A Classification and Regression Tree (CART) procedure is introduced for both variance stabilization and regularization. The CART procedure adaptively clusters genes by their variances. Using both local and cluster-wide information leads to improved estimation of population variances, which in turn improves test statistics, whereas making use of cluster-wide information allows for variance stabilization of the data.
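
Illustrative sketch (not the authors' implementation): the abstract describes clustering genes by variance with a regression tree and then combining gene-level and cluster-wide variance estimates. The toy Python below shows one way that idea could look, under loud assumptions: the tree is a simple one-dimensional splitter on log sample variances, and the regularized variance is a fixed-weight average of a gene's own variance and its cluster mean. The names `split_ranges`, `regularized_variances`, the `weight` parameter, and the toy data are all hypothetical and do not come from the paper.

```python
import math
from statistics import mean, variance


def sse(xs):
    """Sum of squared deviations from the mean (regression-tree node impurity)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)


def split_ranges(vals, lo, hi, depth, min_size):
    """Recursively split the sorted slice vals[lo:hi]; return leaf index ranges."""
    if depth == 0 or hi - lo < 2 * min_size:
        return [(lo, hi)]
    # Pick the cut point that minimizes the total within-node SSE.
    cut = min(
        range(lo + min_size, hi - min_size + 1),
        key=lambda c: sse(vals[lo:c]) + sse(vals[c:hi]),
    )
    return (split_ranges(vals, lo, cut, depth - 1, min_size)
            + split_ranges(vals, cut, hi, depth - 1, min_size))


def regularized_variances(expr, depth=1, min_size=1, weight=0.5):
    """Map each gene to (cluster id, shrunken variance).

    expr: dict of gene name -> list of replicate expression values.
    weight: assumed fixed weight on the gene's own variance (hypothetical choice).
    Assumes every gene has a strictly positive sample variance (log is taken).
    """
    gene_var = {g: variance(x) for g, x in expr.items()}
    genes = sorted(gene_var, key=gene_var.get)
    log_vars = [math.log(gene_var[g]) for g in genes]
    out = {}
    leaves = split_ranges(log_vars, 0, len(genes), depth, min_size)
    for k, (lo, hi) in enumerate(leaves):
        # Cluster-wide estimate: mean of the member genes' variances.
        pooled = mean(gene_var[g] for g in genes[lo:hi])
        for g in genes[lo:hi]:
            # Blend local (gene) and cluster-wide information.
            out[g] = (k, weight * gene_var[g] + (1 - weight) * pooled)
    return out


# Toy data: two low-variance genes and two high-variance genes.
toy = {
    "g1": [1.0, 1.1, 0.9],
    "g2": [2.0, 2.2, 1.8],
    "g3": [5.0, 7.0, 3.0],
    "g4": [4.0, 7.0, 1.0],
}
result = regularized_variances(toy)
```

With a single split, the two low-variance genes and the two high-variance genes end up in separate clusters, and each gene's variance is pulled toward its cluster mean. A real analysis would choose the tree depth and the shrinkage weight from the data rather than fixing them as done here.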

Pages: 2254-2261

Number of pages: 8
