A parallel algorithm for subset selection

Cited by: 1
Authors
Poston, WL
Wegman, EJ
Solka, JL
Affiliations
[1] USN, Ctr Surface Warfare, Dahlgren Div, Dahlgren, VA 22448 USA
[2] George Mason Univ, Ctr Computat Stat, Fairfax, VA 22030 USA
Keywords
parallel subset selection; information matrix; effective independence distribution; hat matrix
DOI
10.1080/00949659808811869
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Prior to performing an analysis of a large data set, it is often desirable to process a subset of the data only. Current methods of subset selection choose points in a random manner, which can lead to poor solutions. The method for selection described in this paper employs the Effective Independence Distribution (EID) method that chooses observations that optimize the determinant of the information matrix. Since the method requires repeated calculations of three matrix multiplications and a matrix inverse, it is computationally intensive for extremely large data sets. A recursive form of the EID is developed here which is suitable for parallelization. The parallel method is described in detail, and load balancing and communication issues are addressed. Implementation results on the Intel Paragon show that this is an effective parallel algorithm.
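The abstract's cost description (three matrix multiplications and one inverse per step) matches computing the diagonal of the hat matrix X (XᵀX)⁻¹ Xᵀ. The sketch below illustrates that idea under stated assumptions and is not the authors' implementation. It assumes the usual greedy EID procedure of repeatedly deleting the row with the smallest EID value, and it uses a Sherman-Morrison rank-one downdate as one plausible recursive form; the abstract does not give the paper's actual recursion. The function names `eid`, `select_subset_naive`, and `select_subset_recursive` are hypothetical.

```python
import numpy as np


def eid(X):
    """EID values: the diagonal of the hat matrix X (X^T X)^{-1} X^T."""
    info_inv = np.linalg.inv(X.T @ X)  # inverse of the information matrix
    # Row-wise quadratic form x_i^T info_inv x_i, without forming the n x n hat matrix.
    return np.einsum("ij,jk,ik->i", X, info_inv, X)


def select_subset_naive(X, m):
    """Greedy deletion (assumed procedure): recompute the inverse at every step."""
    keep = np.arange(X.shape[0])
    while keep.size > m:
        e = eid(X[keep])
        keep = np.delete(keep, np.argmin(e))  # drop the least informative row
    return keep


def select_subset_recursive(X, m):
    """Same greedy deletion, but the inverse is downdated instead of recomputed.

    Assumption: Sherman-Morrison is used here as a stand-in for the paper's
    recursive form. Requires m >= number of columns so X^T X stays invertible.
    """
    keep = np.arange(X.shape[0])
    A_inv = np.linalg.inv(X.T @ X)
    while keep.size > m:
        Xk = X[keep]
        e = np.einsum("ij,jk,ik->i", Xk, A_inv, Xk)
        j = np.argmin(e)
        u = A_inv @ Xk[j]
        # (A - x x^T)^{-1} = A^{-1} + (A^{-1} x)(A^{-1} x)^T / (1 - x^T A^{-1} x)
        A_inv = A_inv + np.outer(u, u) / (1.0 - e[j])
        keep = np.delete(keep, j)
    return keep


rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
subset_naive = select_subset_naive(X, 10)
subset_recursive = select_subset_recursive(X, 10)
```

Deleting the row with the smallest EID value is the greedy choice because det(A - x xᵀ) = det(A)(1 - xᵀA⁻¹x), so that row reduces the determinant of the information matrix the least. The per-row EID values are independent of one another, which is the kind of work a parallel version could distribute across processors.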
Pages: 1-17
Number of pages: 17
Related papers
  • [1] A Parallel Evolutionary Algorithm for Subset Selection in Causal Inference Models
    Cho, Wendy K. Tam
    Liu, Yan Y.
    PROCEEDINGS OF XSEDE16: DIVERSITY, BIG DATA, AND SCIENCE AT SCALE, 2016
  • [2] The COMPSET Algorithm for Subset Selection
    Hamo, Yaniv
    Markovitch, Shaul
    19TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI-05), 2005, : 728 - 733
  • [3] THE FEATURE SUBSET SELECTION ALGORITHM
    Liu, Yongguo
    Li, Xueming
    Wu, Zhongfu
    JOURNAL OF ELECTRONICS (CHINA), 2003, (01) : 57 - 61
  • [4] The Viterbi Algorithm for Subset Selection
    Maymon, Shay
    Eldar, Yonina C.
    IEEE SIGNAL PROCESSING LETTERS, 2015, 22 (05) : 524 - 528
  • [5] Median Selection Subset Aggregation for Parallel Inference
    Wang, Xiangyu
    Peng, Peichao
    Dunson, David B.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 27 (NIPS 2014), 2014, 27
  • [6] Algorithm for the optimal feature subset selection
    Zhu, Ming
    Wang, Junpu
    Cai, Qingsheng
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 35 (09) : 803 - 805
  • [7] Genetic algorithm guided selection: Variable selection and subset selection
    Cho, SJ
    Hermsmeier, MA
    JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2002, 42 (04) : 927 - 936
  • [8] A PARALLEL SELECTION ALGORITHM
    GUPTA, P
    BHATTACHARJEE, GP
    BIT, 1984, 24 (03) : 274 - 287
  • [9] Fast Parallel Algorithms for Statistical Subset Selection Problems
    Qian, Sharon
    Singer, Yaron
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [10] IMPROVED FORWARD FLOATING SELECTION ALGORITHM FOR FEATURE SUBSET SELECTION
    Nakariyakul, Songyot
    Casasent, David P.
    PROCEEDINGS OF 2008 INTERNATIONAL CONFERENCE ON WAVELET ANALYSIS AND PATTERN RECOGNITION, VOLS 1 AND 2, 2008, : 793 - +