A prototype knockoff filter for group selection with FDR control

Cited by: 4
Authors
Chen, Jiajie [1]
Hou, Anthony [2]
Hou, Thomas Y. [3]
Affiliations
[1] Peking Univ, Sch Math Sci, Beijing 100871, Peoples R China
[2] Harvard Univ, Dept Stat, Cambridge, MA 02138 USA
[3] CALTECH, Appl & Computat Math, Pasadena, CA 91125 USA
Funding
National Science Foundation (USA)
Keywords
variable selection; false discovery rate (FDR); group variable selection; knockoff filter; linear regression
DOI
10.1093/imaiai/iaz012
Chinese Library Classification (CLC) number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
In many applications, we need to study a linear regression model that consists of a response variable and a large number of potential explanatory variables, and determine which variables are truly associated with the response. In Foygel Barber & Candes (2015, Ann. Statist., 43, 2055-2085), the authors introduced a new variable selection procedure called the knockoff filter to control the false discovery rate (FDR) and proved that this method achieves exact FDR control. In this paper, we propose a prototype knockoff filter for group selection by extending the Reid-Tibshirani (2016, Biostatistics, 17, 364-376) prototype method. Our prototype knockoff filter improves the computational efficiency and statistical power of the Reid-Tibshirani prototype method when it is applied for group selection. In some cases when the group features are spanned by one or a few hidden factors, we demonstrate that the Principal Component Analysis (PCA) prototype knockoff filter outperforms the Dai-Foygel Barber (2016, 33rd International Conference on Machine Learning (ICML 2016)) group knockoff filter. We present several numerical experiments to compare our prototype knockoff filter with the Reid-Tibshirani prototype method and the group knockoff filter. We have also conducted some analysis of the knockoff filter. Our analysis reveals that some knockoff path method statistics, including the Lasso path statistic, may lead to loss of power for certain design matrices and a specially designed response even if their signal strengths are still relatively strong.
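The selection step that knockoff-type filters share can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' implementation: it assumes that one statistic W_j has already been computed per feature (or per group prototype), with large positive values suggesting a true signal, and it applies the data-dependent knockoff+ threshold of Foygel Barber & Candes (2015) at a target FDR level q. The function names `knockoff_threshold` and `knockoff_select` and the `offset` parameter are hypothetical names chosen for this example.

```python
import numpy as np


def knockoff_threshold(W, q=0.2, offset=1):
    """Smallest t among the nonzero |W_j| such that
    (offset + #{j: W_j <= -t}) / max(1, #{j: W_j >= t}) <= q.

    offset=1 corresponds to the knockoff+ rule; offset=0 is the less
    conservative variant. Returns np.inf when no candidate qualifies.
    """
    W = np.asarray(W, dtype=float)
    # Candidate thresholds are the distinct nonzero magnitudes, ascending.
    candidates = np.unique(np.abs(W[W != 0]))
    for t in candidates:
        estimated_false = offset + np.sum(W <= -t)
        selected = max(1, np.sum(W >= t))
        if estimated_false / selected <= q:
            return float(t)
    return np.inf


def knockoff_select(W, q=0.2, offset=1):
    """Indices j with W_j at or above the data-dependent threshold."""
    W = np.asarray(W, dtype=float)
    T = knockoff_threshold(W, q=q, offset=offset)
    return np.flatnonzero(W >= T)


if __name__ == "__main__":
    # Made-up statistics, used only to exercise the functions.
    W_demo = [5.0, 4.0, 3.0, 2.5, 2.0, 1.5, -1.0, 0.5, -0.3, 6.0]
    print(knockoff_threshold(W_demo, q=0.2))
    print(knockoff_select(W_demo, q=0.2))
```

In a group setting, the same thresholding would be applied to one statistic per group; how that statistic is built from a group prototype is specific to the paper and is not reproduced here.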
Pages: 271-288
Page count: 18
Related papers
50 in total
  • [1] The knockoff filter for FDR control in group-sparse and multitask regression
    Dai, Ran
    Barber, Rina Foygel
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
  • [2] A generalized knockoff procedure for FDR control in structural change detection
    Liu, Jingyuan
    Sun, Ao
    Ke, Yuan
    JOURNAL OF ECONOMETRICS, 2024, 239 (01)
  • [3] Model-Free Feature Screening and FDR Control With Knockoff Features
    Liu, Wanjun
    Ke, Yuan
    Liu, Jingyuan
    Li, Runze
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2022, 117 (537) : 428 - 443
  • [4] DIFFERENTIALLY PRIVATE VARIABLE SELECTION VIA THE KNOCKOFF FILTER
    Pournaderi, Mehrdad
    Xiang, Yu
    2021 IEEE 31ST INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2021
  • [5] MULTILAYER KNOCKOFF FILTER: CONTROLLED VARIABLE SELECTION AT MULTIPLE RESOLUTIONS
    Katsevich, Eugene
    Sabatti, Chiara
    ANNALS OF APPLIED STATISTICS, 2019, 13 (01) : 1 - 33
  • [6] Output-Related and -Unrelated Fault Monitoring with an Improvement Prototype Knockoff Filter and Feature Selection Based on Laplacian Eigen Maps and Sparse Regression
    Xue, Cuiping
    Zhang, Tie
    Xiao, Dong
    ACS OMEGA, 2021, 6 (16) : 10828 - 10839
  • [7] Feature screening and FDR control with knockoff features for ultrahigh-dimensional right-censored data
    Pan, Yingli
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2022, 173
  • [8] A pseudo knockoff filter for correlated features
    Chen, Jiajie
    Hou, Anthony
    Hou, Thomas Y.
    INFORMATION AND INFERENCE-A JOURNAL OF THE IMA, 2019, 8 (02) : 313 - 341
  • [9] GGM knockoff filter: False discovery rate control for Gaussian graphical models
    Li, Jinzhou
    Maathuis, Marloes H.
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2021, 83 (03) : 534 - 558
  • [10] Grace-AKO: a novel and stable knockoff filter for variable selection incorporating gene network structures
    Tian, Peixin
    Hu, Yiqian
    Liu, Zhonghua
    Zhang, Yan Dora
    BMC Bioinformatics, 23