Hashing Supported Iterative MapReduce Based Scalable SBE Reduct Computation

Cited by: 4
Authors
Divya, U. Venkata [1]
Prasad, P. S. V. S. Sai [2]
Affiliations
[1] Quadrat Insights Pvt Ltd, Hyderabad, India
[2] Univ Hyderabad, Sch Comp & Informat Sci, Hyderabad, India
Keywords
Rough Sets; Reduct; Iterative MapReduce; Apache Spark; Scalable feature selection
DOI
10.1007/978-3-319-72344-0_13
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Feature selection plays a major role in the preprocessing stage of data mining and helps in model construction by identifying relevant features. Rough set theory has emerged in recent years as an important paradigm for feature selection, i.e., finding a reduct of the conditional attributes in a given data set. Two control strategies for reduct computation are Sequential Forward Selection (SFS) and Sequential Backward Elimination (SBE). With the objective of scalable feature selection, several MapReduce-based approaches have been proposed in the literature. All of these approaches are SFS based and result in a superset of a reduct, i.e., a set that still contains redundant attributes. Although SBE approaches yield an exact reduct, they require a lot of data movement in the shuffle and sort phase of MapReduce. To overcome this problem and to optimize network bandwidth utilization, a novel hashing-supported SBE reduct algorithm (MRSBER Hash) is proposed in this work and implemented using the iterative MapReduce framework of Apache Spark. Experiments conducted on large benchmark decision systems have empirically established the relevance of the proposed approach for decision systems with a large cardinality of conditional attributes.
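The SBE strategy described in the abstract can be illustrated with a minimal single-machine sketch. This is not the MRSBER Hash algorithm itself, whose details are not given here. It assumes a consistent decision table held in memory, and it assumes, purely for illustration, that "hashing" means replacing each object's projected attribute values with a fixed-size digest, so that a distributed version would shuffle a short key rather than the full projection. The names `projection_key`, `is_consistent`, `sbe_reduct`, and `TABLE` are hypothetical.

```python
import hashlib
from typing import List, Sequence, Tuple

# One object of the decision table: (conditional attribute values, decision).
Row = Tuple[Tuple[int, ...], int]


def projection_key(values: Sequence[int], attrs: Sequence[int]) -> bytes:
    """Fixed-size digest of an object's values on `attrs`.

    Assumption: this stands in for whatever hashing the paper uses; the
    digest length does not grow with the number of attributes.
    """
    projected = ",".join(str(values[a]) for a in attrs)
    return hashlib.md5(projected.encode("utf-8")).digest()


def is_consistent(rows: Sequence[Row], attrs: Sequence[int]) -> bool:
    """True if objects that agree on `attrs` always share one decision."""
    decision_of = {}
    for values, decision in rows:
        key = projection_key(values, attrs)
        # setdefault stores the first decision seen for this key.
        if decision_of.setdefault(key, decision) != decision:
            return False
    return True


def sbe_reduct(rows: Sequence[Row], n_attrs: int) -> List[int]:
    """Sequential Backward Elimination over a consistent decision table.

    Start from all attributes and drop each one whose removal keeps the
    table consistent; what remains has no redundant attribute.
    """
    reduct = list(range(n_attrs))
    for a in range(n_attrs):
        candidate = [b for b in reduct if b != a]
        if is_consistent(rows, candidate):
            reduct = candidate
    return reduct


# Toy table (hypothetical data): attribute 1 alone determines the decision.
TABLE: List[Row] = [
    ((0, 0, 1), 0),
    ((0, 1, 1), 1),
    ((1, 0, 0), 0),
    ((1, 1, 0), 1),
]

result = sbe_reduct(TABLE, 3)
print(result)
```

In an iterative MapReduce setting, the grouping done by the dictionary in `is_consistent` would correspond to a shuffle keyed on the digest, which is where a shorter key would save network traffic under the assumption stated above.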
Pages: 163-170
Page count: 8