A parallel algorithm for subset selection

Times cited: 1
Authors
Poston, WL
Wegman, EJ
Solka, JL
Affiliations
[1] USN, Ctr Surface Warfare, Dahlgren Div, Dahlgren, VA 22448 USA
[2] George Mason Univ, Ctr Computat Stat, Fairfax, VA 22030 USA
Keywords
parallel subset selection; information matrix; effective independence distribution; hat matrix
DOI
10.1080/00949659808811869
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Prior to performing an analysis of a large data set, it is often desirable to process a subset of the data only. Current methods of subset selection choose points in a random manner, which can lead to poor solutions. The method for selection described in this paper employs the Effective Independence Distribution (EID) method that chooses observations that optimize the determinant of the information matrix. Since the method requires repeated calculations of three matrix multiplications and a matrix inverse, it is computationally intensive for extremely large data sets. A recursive form of the EID is developed here which is suitable for parallelization. The parallel method is described in detail, and load balancing and communication issues are addressed. Implementation results on the Intel Paragon show that this is an effective parallel algorithm.
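The selection idea in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation. It assumes the usual effective-independence formulation: each row's score is the matching diagonal entry of the hat matrix, and the lowest-scoring row is deleted until k rows remain. Deleting a row x scales the determinant of the information matrix A = XᵀX by (1 − xᵀA⁻¹x), so dropping the smallest score keeps the determinant as large as possible at each step. The second function avoids re-inverting by using a Sherman-Morrison downdate. Whether that matches the recursive form developed in the paper is an assumption. The function names `eid_select` and `eid_select_recursive` are hypothetical, and no parallelization is shown.

```python
import numpy as np


def eid_select(X, k):
    """Greedy backward deletion: drop the lowest-leverage row until k remain.

    Recomputes the inverse of the information matrix at every step.
    """
    X = np.asarray(X, dtype=float)
    keep = np.arange(X.shape[0])
    while keep.size > k:
        Xs = X[keep]
        # Inverse of the information matrix of the current subset.
        A_inv = np.linalg.inv(Xs.T @ Xs)
        # Diagonal of the hat matrix Xs (Xs^T Xs)^{-1} Xs^T.
        e = np.einsum("ij,jk,ik->i", Xs, A_inv, Xs)
        keep = np.delete(keep, np.argmin(e))
    return keep


def eid_select_recursive(X, k):
    """Same greedy rule, but A^{-1} is updated in place of being recomputed.

    Assumption: a Sherman-Morrison rank-one downdate stands in for the
    paper's recursive form, which the abstract does not spell out.
    """
    X = np.asarray(X, dtype=float)
    keep = np.arange(X.shape[0])
    A_inv = np.linalg.inv(X.T @ X)  # one full inverse up front
    while keep.size > k:
        Xs = X[keep]
        e = np.einsum("ij,jk,ik->i", Xs, A_inv, Xs)
        j = np.argmin(e)
        u = A_inv @ Xs[j]
        # (A - x x^T)^{-1} = A^{-1} + (A^{-1} x)(A^{-1} x)^T / (1 - x^T A^{-1} x)
        A_inv = A_inv + np.outer(u, u) / (1.0 - e[j])
        keep = np.delete(keep, j)
    return keep


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 4))  # synthetic data, for illustration only
    print(eid_select_recursive(X, 20))
```

Both functions require k to be at least the number of columns of X so that the information matrix stays invertible. In the recursive variant the per-row scores can be computed independently, which is one natural place to split the work across processors.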
Pages: 1 / 17
Page count: 17
Related papers
50 in total
  • [21] BRANCH AND BOUND ALGORITHM FOR FEATURE SUBSET SELECTION
    NARENDRA, P
    FUKUNAGA, K
    IEEE TRANSACTIONS ON COMPUTERS, 1977, 26 (09) : 917 - 922
  • [22] A boosting algorithm with subset selection of training patterns
    Nakashima, T
    Nakai, G
    Ishibuchi, H
    PROCEEDINGS OF THE 12TH IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1 AND 2, 2003, : 690 - 695
  • [23] A hierarchy reduct algorithm for feature subset selection
    Qu, BB
    Lu, YS
    PROCEEDINGS OF THE 2004 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2004, : 1157 - 1161
  • [24] Feature subset selection based on the genetic algorithm
    Yang, Jingwei
    Wang, Sile
    Chen, Yingyi
    Lu, Sukui
    Yang, Wenzhu
    ADVANCED TECHNOLOGIES IN MANUFACTURING, ENGINEERING AND MATERIALS, PTS 1-3, 2013, 774-776 : 1532 - +
  • [25] AN OPTIMAL ALGORITHM FOR PARALLEL SELECTION
    AKL, SG
    INFORMATION PROCESSING LETTERS, 1984, 19 (01) : 47 - 50
  • [26] Parallel algorithm of selecting the best subset on Cp criterion
    Hu, Qingjun
    Wu, Yi
    Guofang Keji Daxue Xuebao/Journal of National University of Defense Technology, 1993, 15 (02):
  • [27] Solving feature subset selection problem by a Parallel Scatter Search
    López, FG
    Torres, MG
    Batista, BM
    Pérez, JAM
    Moreno-Vega, JM
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2006, 169 (02) : 477 - 489
  • [28] An extended DEIM algorithm for subset selection and class identification
    Hendryx, Emily P.
    Riviere, Beatrice M.
    Rusin, Craig G.
    MACHINE LEARNING, 2021, 110 (04) : 621 - 650
  • [29] Genetic algorithm with fuzzy operators for feature subset selection
    Chakraborty, B
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2002, E85A (09) : 2089 - 2092