A Splicing Approach to Best Subset of Groups Selection

Cited by: 10
Authors
Zhang, Yanhang [1, 2]
Zhu, Junxian [3 ]
Zhu, Jin [2 ]
Wang, Xueqin [4 ]
Affiliations
[1] Renmin Univ China, Sch Stat, Beijing 100872, Peoples R China
[2] Sun Yat Sen Univ, Southern China Ctr Stat Sci, Sch Math, Dept Stat Sci, Guangzhou 510275, Peoples R China
[3] Natl Univ Singapore, Saw Swee Hock Sch Publ Hlth, Singapore 117549, Singapore
[4] Univ Sci & Technol China, Int Inst Finance, Sch Management, Dept Stat & Finance, Hefei 230026, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
best subset of groups selection; group splicing; group information criterion; selection consistency of subset of groups; polynomial computational complexity; VARIABLE SELECTION; GENE-EXPRESSION; MODEL SELECTION; REGRESSION; SPARSITY; LASSO; RECOVERY; SIGNALS; UNION;
DOI
10.1287/ijoc.2022.1241
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Best subset of groups selection (BSGS) is the process of selecting a small part of nonoverlapping groups to achieve the best interpretability on the response variable. It has attracted increasing attention and has far-reaching applications in practice. However, due to the computational intractability of BSGS in high-dimensional settings, developing efficient algorithms for solving BSGS remains a research hotspot. In this paper, we propose a group-splicing algorithm that iteratively detects the relevant groups and excludes the irrelevant ones. Moreover, coupled with a novel group information criterion, we develop an adaptive algorithm to determine the optimal model size. Under certain conditions, it is certifiable that our algorithm can identify the optimal subset of groups in polynomial time with high probability. Finally, we demonstrate the efficiency and accuracy of our methods by comparing them with several state-of-the-art algorithms on both synthetic and real-world data sets.
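The abstract describes group splicing only in words: relevant groups are swapped in and irrelevant ones swapped out until the fit stops improving. The sketch below is one possible reading of that idea for least-squares regression with a fixed number of groups. It is not the authors' implementation. The function names (`fit_rss`, `group_splice`), the scoring rules used to rank groups, the `init`, `k_max`, and `max_iter` parameters, and the stopping tolerance are all assumptions made for illustration, and the group information criterion for choosing the model size is not covered.

```python
import numpy as np


def fit_rss(X, y, groups, active):
    """Least-squares fit restricted to the active groups; returns (beta, rss)."""
    beta = np.zeros(X.shape[1])
    if active:
        cols = np.concatenate([groups[g] for g in sorted(active)])
        coef = np.linalg.lstsq(X[:, cols], y, rcond=None)[0]
        beta[cols] = coef
    resid = y - X @ beta
    return beta, float(resid @ resid)


def group_splice(X, y, groups, size, init=None, k_max=None, max_iter=50):
    """Swap weak active groups for strong inactive ones while the RSS drops.

    groups: list of integer index arrays, one per non-overlapping group.
    size:   number of groups to keep (fixed here, not chosen adaptively).
    init:   optional starting set of group indices (hypothetical parameter).
    """
    J = len(groups)
    if k_max is None:
        k_max = min(size, J - size)
    if init is None:
        # Assumed initialization: groups most correlated with the response.
        score = [np.linalg.norm(X[:, g].T @ y) for g in groups]
        active = set(np.argsort(score)[::-1][:size].tolist())
    else:
        active = set(init)
    beta, rss = fit_rss(X, y, groups, active)

    for _ in range(max_iter):
        resid = y - X @ beta
        inactive = [g for g in range(J) if g not in active]
        # Assumed scores: an active group's contribution to the fitted values,
        # and an inactive group's correlation with the current residual.
        back = sorted(
            active,
            key=lambda g: np.linalg.norm(X[:, groups[g]] @ beta[groups[g]]),
        )
        fwd = sorted(
            inactive,
            key=lambda g: -np.linalg.norm(X[:, groups[g]].T @ resid),
        )
        improved = False
        for k in range(1, k_max + 1):
            # Exchange the k weakest active groups for the k strongest inactive ones.
            cand = (active - set(back[:k])) | set(fwd[:k])
            b, r = fit_rss(X, y, groups, cand)
            if r < rss - 1e-10:
                active, beta, rss, improved = cand, b, r, True
                break
        if not improved:
            break
    return sorted(active), beta, rss
```

Called as `group_splice(X, y, groups, size=3)`, the function returns the sorted indices of the selected groups, the coefficient vector, and the residual sum of squares. Accepting a swap only when the loss strictly decreases guarantees termination, since the active set can never return to an earlier, worse configuration.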
Pages: 104-119
Number of pages: 17