Independence screening methods, such as the two-sample t-test and marginal-correlation-based ranking, are among the most widely used techniques for variable selection in ultrahigh-dimensional data sets. In this short note, simple examples are used to demonstrate potential problems with independence screening methods in the presence of correlated predictors. An example is also considered in which all important variables are mutually independent and all but one of the important variables are independent of the unimportant variables. Furthermore, a real data example from a genome-wide association study is used to illustrate the inferior performance of marginal correlation screening compared to another screening method.
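The failure mode described above can be sketched with a small simulation. This is an illustrative construction, not one of the note's own examples: it assumes equicorrelated Gaussian predictors with pairwise correlation `rho`, and a coefficient on the fourth predictor chosen as `-3 * rho` so that its population covariance with the response is exactly zero even though it is an important variable. The function name `marginal_correlation_ranking` and all parameter values are hypothetical.

```python
import numpy as np

def marginal_correlation_ranking(X, y):
    """Rank predictors by absolute sample correlation with y (largest first)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    corr = Xc.T @ yc / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    order = np.argsort(-np.abs(corr))
    return order, corr

# Assumed simulation settings (illustrative only).
rng = np.random.default_rng(0)
n, p, rho = 2000, 50, 0.5

# Equicorrelated predictors: unit variance, pairwise correlation rho.
shared = rng.standard_normal((n, 1))
X = np.sqrt(rho) * shared + np.sqrt(1 - rho) * rng.standard_normal((n, p))

# Four important variables; the fourth coefficient cancels the covariance
# that X4 inherits from X1..X3: cov(X4, y) = 3 * rho + beta[3] = 0.
beta = np.zeros(p)
beta[:3] = 1.0
beta[3] = -3 * rho
y = X @ beta + rng.standard_normal(n)

order, corr = marginal_correlation_ranking(X, y)
rank_of_x4 = int(np.where(order == 3)[0][0]) + 1  # 1-based rank of X4

print("top three predictors (0-based):", sorted(order[:3].tolist()))
print("rank of important predictor X4 among", p, "predictors:", rank_of_x4)
```

Under these assumptions, every unimportant predictor has population covariance `3 * rho - 3 * rho**2 = 0.75` with the response, while the important predictor X4 has covariance zero, so a marginal ranking tends to place X4 below the noise variables.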
Li, Gaorong; Peng, Heng; Zhang, Jun; Zhu, Lixing. Annals of Statistics, 2012, 40(3): 1846-1877.