Detecting weak signals in high dimensions

Cited by: 1
Author
Jeng, X. Jessie [1]
Affiliation
[1] N Carolina State Univ, Dept Stat, SAS Hall,2311 Stinson Dr, Raleigh, NC 27695 USA
Keywords
False negative control; Multiple testing; Variable screening; Variable selection; Trichotomous analysis; COPY-NUMBER VARIATION; FALSE DISCOVERY RATE; PROPORTION; SELECTION; ORACLE; RULES; POWER; SIZE
DOI
10.1016/j.jmva.2016.02.004
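Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract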
Fast emerging high-throughput technology advances scientific applications into a new era by enabling detection of information-bearing signals with unprecedented sizes. Despite its potential, the analysis of ultrahigh-dimensional data involves fundamental challenges, wherein the deluge of a large amount of irrelevant data can easily obscure the true signals. Classical statistical methods for low to moderate-dimensional data focus on identifying strong true signals using false positive control criteria. These methods, however, have limited power for identifying weak true signals embedded in an extremely large amount of noise. This paper seeks to facilitate the detection of weak signals by introducing a new approach based on false negative instead of false positive control. As a result, a high proportion of weak signals can be retained for follow-up study. The new procedure is completely data-driven and fast in computation. We show in theory its efficiency and adaptivity to the unknown features of the data including signal intensity and sparsity. Simulation studies further evaluate the method under various model settings. We apply the new method in a real-data analysis on detecting genomic variants with varying signal intensities. © 2016 Elsevier Inc. All rights reserved.
Pages: 234-246
Number of pages: 13