Hashing Supported Iterative MapReduce Based Scalable SBE Reduct Computation

Cited by: 4

Authors:

Divya, U. Venkata [1]

Prasad, P. S. V. S. Sai [2]

Affiliations:

[1] Quadrat Insights Pvt Ltd, Hyderabad, India

[2] Univ Hyderabad, Sch Comp & Informat Sci, Hyderabad, India

Source:

DISTRIBUTED COMPUTING AND INTERNET TECHNOLOGY (ICDCIT 2018) | 2018 / Volume 10722

Keywords:

Rough Sets; Reduct; Iterative MapReduce; Apache Spark; Scalable feature selection;

DOI:

10.1007/978-3-319-72344-0_13

Chinese Library Classification number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Feature selection plays a major role in the preprocessing stage of data mining and helps in model construction by identifying relevant features. Rough set theory has emerged in recent years as an important paradigm for feature selection, i.e., finding a reduct of the conditional attributes in a given data set. The two control strategies for reduct computation are Sequential Forward Selection (SFS) and Sequential Backward Elimination (SBE). With the objective of scalable feature selection, several MapReduce-based approaches have been proposed in the literature. All of these approaches are SFS based and result in a superset of a reduct, i.e., a set that still contains redundant attributes. Although SBE approaches result in an exact reduct, they require a lot of data movement in the shuffle-and-sort phase of MapReduce. To overcome this problem and to optimize network bandwidth utilization, a novel hashing-supported SBE reduct algorithm (MRSBER Hash) is proposed in this work and implemented using the iterative MapReduce framework of Apache Spark. Experiments conducted on large benchmark decision systems have empirically established the relevance of the proposed approach for decision systems with a large cardinality of conditional attributes.
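The SBE control strategy named in the abstract can be illustrated with a minimal single-machine sketch. This is not the MRSBER Hash algorithm and does not use Spark or MapReduce; it only shows the generic idea of starting from all conditional attributes and dropping each one whose removal leaves the positive region unchanged. The function names `positive_region_size` and `sbe_reduct`, the row layout, and the toy decision table are assumptions made for illustration.

```python
from typing import Dict, Hashable, List, Sequence, Set, Tuple

# Assumed row layout: (conditional attribute values, decision value).
Row = Tuple[Sequence[Hashable], Hashable]


def positive_region_size(rows: List[Row], attrs: List[int]) -> int:
    """Count objects whose equivalence class under `attrs` has one decision."""
    decisions: Dict[tuple, Set[Hashable]] = {}
    sizes: Dict[tuple, int] = {}
    for values, decision in rows:
        # Objects with equal projected values fall into the same class.
        key = tuple(values[a] for a in attrs)
        decisions.setdefault(key, set()).add(decision)
        sizes[key] = sizes.get(key, 0) + 1
    return sum(n for key, n in sizes.items() if len(decisions[key]) == 1)


def sbe_reduct(rows: List[Row], n_attrs: int) -> List[int]:
    """Sequential backward elimination over attribute indices 0..n_attrs-1."""
    reduct = list(range(n_attrs))
    target = positive_region_size(rows, reduct)
    for a in range(n_attrs):
        candidate = [b for b in reduct if b != a]
        # Drop attribute `a` only if the positive region is preserved.
        if positive_region_size(rows, candidate) == target:
            reduct = candidate
    return reduct


if __name__ == "__main__":
    # Toy decision table (hypothetical): the decision depends only on attribute 1.
    table: List[Row] = [
        ((0, 0, 1), "n"),
        ((0, 1, 1), "y"),
        ((1, 0, 0), "n"),
        ((1, 1, 0), "y"),
    ]
    print(sbe_reduct(table, 3))  # prints [1]
```

Each elimination step re-partitions the objects on the remaining attributes, which in a MapReduce setting corresponds to the shuffle-heavy grouping the abstract identifies as the cost of SBE.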

Pages: 163-170

Page count: 8
