A prototype knockoff filter for group selection with FDR control

Cited by: 4
Authors
Chen, Jiajie [1]
Hou, Anthony [2]
Hou, Thomas Y. [3]
Affiliations
[1] Peking Univ, Sch Math Sci, Beijing 100871, Peoples R China
[2] Harvard Univ, Dept Stat, Cambridge, MA 02138 USA
[3] CALTECH, Appl & Computat Math, Pasadena, CA 91125 USA
Funding
National Science Foundation (USA)
Keywords
variable selection; false discovery rate (FDR); group variable selection; knockoff filter; linear regression
DOI
10.1093/imaiai/iaz012
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
In many applications, we need to study a linear regression model that consists of a response variable and a large number of potential explanatory variables, and to determine which variables are truly associated with the response. In Foygel Barber & Candes (2015, Ann. Statist., 43, 2055-2085), the authors introduced a new variable selection procedure, the knockoff filter, to control the false discovery rate (FDR), and proved that this method achieves exact FDR control. In this paper, we propose a prototype knockoff filter for group selection by extending the Reid-Tibshirani (2016, Biostatistics, 17, 364-376) prototype method. Our prototype knockoff filter improves on the computational efficiency and statistical power of the Reid-Tibshirani prototype method when it is applied to group selection. In cases where the features in a group are spanned by one or a few hidden factors, we demonstrate that the Principal Component Analysis (PCA) prototype knockoff filter outperforms the Dai-Foygel Barber (2016, 33rd International Conference on Machine Learning (ICML 2016)) group knockoff filter. We present several numerical experiments comparing our prototype knockoff filter with the Reid-Tibshirani prototype method and the group knockoff filter. We also analyze the knockoff filter itself. Our analysis reveals that some knockoff statistics based on path methods, including the Lasso path statistic, may lose power for certain design matrices and a specially designed response, even when the signal strengths are still relatively strong.
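The selection step that gives the knockoff filter its FDR control can be sketched as follows. This is a minimal illustration of the knockoff+ threshold from the Foygel Barber & Candes paper cited in the abstract, not the prototype procedure proposed here. It assumes a vector W of knockoff statistics has already been computed, one per feature (or, in a group setting, one per group prototype); how W is built is outside this sketch, and the function names are hypothetical.

```python
import numpy as np

def knockoff_plus_threshold(W, q):
    """Smallest t with (1 + #{W_j <= -t}) / max(1, #{W_j >= t}) <= q."""
    W = np.asarray(W, dtype=float)
    # Candidate thresholds are the distinct nonzero magnitudes of W, ascending.
    candidates = np.sort(np.unique(np.abs(W[W != 0])))
    for t in candidates:
        # Large negative statistics estimate the number of false positives;
        # the "+1" is the knockoff+ correction.
        false_est = 1 + np.sum(W <= -t)
        selected = max(1, np.sum(W >= t))
        if false_est / selected <= q:
            return float(t)
    return np.inf  # no threshold meets the target, so nothing is selected

def knockoff_select(W, q):
    """Indices whose statistic is at or above the knockoff+ threshold."""
    W = np.asarray(W, dtype=float)
    return np.flatnonzero(W >= knockoff_plus_threshold(W, q))

# Made-up statistics, for illustration only.
W_demo = [5, 4, 3, 2.5, -1, 2, 0.5, -0.2, 3.5, 6]
selected_demo = knockoff_select(W_demo, q=0.2)
```

A large positive statistic is evidence for the original variable over its knockoff copy, so only entries at or above the threshold are selected. Returning an infinite threshold when no candidate qualifies makes the procedure select nothing rather than exceed the target level q.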
Pages: 271-288
Page count: 18