Rapid and optimized parallel attribute reduction based on neighborhood rough sets and MapReduce

Cited by: 4
Authors
Hanuman, V. K. [1]
Chebrolu, Srilatha [1]
Affiliations
[1] Natl Inst Technol Andhra Pradesh, Dept Comp Sci & Engn, Tadepalligudem 534101, Andhra Pradesh, India
Keywords
Attribute reduction; Neighborhood rough sets; MapReduce; Neighborhood information; Data preprocessing; Computational complexity; High-dimensional data; ALGORITHM; EFFICIENT
DOI
10.1016/j.eswa.2024.125323
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Attribute reduction is a crucial step in data pre-processing and feature engineering. It selects a subset of relevant data attributes to reduce the computational complexity of machine learning models and improve their performance. Neighborhood rough set (NRS) theory provides a valuable framework for attribute reduction. It leverages neighborhood information to identify non-redundant and informative attributes for data analysis and machine learning tasks. Attribute subsets derived from NRS theory are of high quality and yield effective prediction accuracy in Euclidean space. However, existing NRS-based solutions are resource-intensive because of the large search space required for finding neighborhoods and because of redundant computations. To overcome these limitations, we propose the rapid and optimized attribute reduction (ROAR) algorithm, which optimizes the current state-of-the-art attribute-reduction method in NRS theory. The strength of ROAR lies in its ability to accelerate computations by rapidly determining the neighborhood consistency of data samples, which in turn expedites the identification of both positive and boundary regions. This efficiency significantly reduces the overall processing time of data analysis tasks. Experimental results on 12 standard datasets demonstrate that the ROAR algorithm is highly efficient, obtaining accurate reduction results with rapid response times. To ensure that the ROAR algorithm is suitable for high-dimensional datasets, we provide a parallel implementation, namely, the P-ROAR algorithm. The P-ROAR algorithm is the first parallel attribute-reduction algorithm in classical NRS theory. Measurements of computational speed and scalability establish that P-ROAR is much faster and more scalable for datasets with an enormous attribute space. These algorithms provide a tool for handling feature reduction in data engineering without compromising accuracy and performance.
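The abstract's notions of neighborhood, neighborhood consistency, and positive region can be illustrated with a minimal sketch. This is not the ROAR or P-ROAR algorithm, whose details the abstract does not give. It is a plain, unoptimized forward-greedy reduction in the classical NRS style. The function names, the toy data, and the radius `delta = 0.3` are all assumptions made for illustration.

```python
import math

def neighborhood(data, i, attrs, delta):
    """Indices of samples within Euclidean distance delta of sample i,
    measured only on the attribute indices in attrs."""
    xi = data[i]
    return [
        j for j, xj in enumerate(data)
        if math.sqrt(sum((xi[a] - xj[a]) ** 2 for a in attrs)) <= delta
    ]

def positive_region(data, labels, attrs, delta):
    """Samples that are neighborhood-consistent: every neighbor
    carries the same decision label as the sample itself."""
    return [
        i for i in range(len(data))
        if all(labels[j] == labels[i]
               for j in neighborhood(data, i, attrs, delta))
    ]

def dependency(data, labels, attrs, delta):
    """Fraction of samples that fall in the positive region."""
    return len(positive_region(data, labels, attrs, delta)) / len(data)

def greedy_reduct(data, labels, delta):
    """Forward greedy selection: repeatedly add the attribute that raises
    the dependency most, until it matches that of the full attribute set."""
    all_attrs = list(range(len(data[0])))
    target = dependency(data, labels, all_attrs, delta)
    reduct, remaining, best = [], all_attrs[:], 0.0
    while remaining and best < target:
        cand = max(
            remaining,
            key=lambda a: dependency(data, labels, reduct + [a], delta),
        )
        gain = dependency(data, labels, reduct + [cand], delta)
        if gain <= best:  # no attribute improves consistency; stop
            break
        reduct.append(cand)
        remaining.remove(cand)
        best = gain
    return reduct

# Hypothetical toy data: attribute 0 separates the two classes,
# attribute 1 does not.
data = [[0.1, 0.9], [0.2, 0.1], [0.8, 0.8], [0.9, 0.2]]
labels = [0, 0, 1, 1]
reduct = greedy_reduct(data, labels, delta=0.3)
print(reduct)
```

On this toy data, attribute 0 alone makes every sample neighborhood-consistent, so the sketch selects it and stops. Each dependency evaluation here compares all pairs of samples, which is the kind of cost that the abstract says existing NRS-based methods incur.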
Pages: 18
Related papers
50 in total
  • [41] An improved attribute reduction scheme with covering based rough sets
    Wang, Changzhong
    Shao, Mingwen
    Sun, Baiqing
    Hu, Qinghua
    APPLIED SOFT COMPUTING, 2015, 26 : 235 - 243
  • [42] Attribute reduction and decision rule generation based on rough sets
    Xu, J
    Jin, H
    Zhang, H
    PROCEEDINGS OF THE 11TH JOINT INTERNATIONAL COMPUTER CONFERENCE, 2005, : 505 - 508
  • [43] Attribute reduction based on interval-set rough sets
    Chunge Ren
    Ping Zhu
    Soft Computing, 2024, 28 : 1893 - 1908
  • [44] Attribute reduction based on directional semi-neighborhood rough set
    Qian, Damo
    Liu, Keyu
    Wang, Jie
    Zhang, Shiming
    Yang, Xibei
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2024, : 2523 - 2535
  • [45] ON ATTRIBUTE REDUCTION WITH INTUITIONISTIC FUZZY ROUGH SETS
    Zhang, Zhiming
    Tian, Jingfeng
    INTERNATIONAL JOURNAL OF UNCERTAINTY FUZZINESS AND KNOWLEDGE-BASED SYSTEMS, 2012, 20 (01) : 59 - 76
  • [46] Research on Attribute Reduction Using Rough Neighborhood Model
    He, Ming
    Du, Yong-ping
    ISBIM: 2008 INTERNATIONAL SEMINAR ON BUSINESS AND INFORMATION MANAGEMENT, VOL 1, 2009, : 268 - 270
  • [47] A Fast Attribute Reduction Algorithm of Neighborhood Rough Set
    Li, Wenhua
    Xia, Shuyin
    Chen, Zizhong
    2021 13TH INTERNATIONAL CONFERENCE ON KNOWLEDGE AND SMART TECHNOLOGY (KST-2021), 2021, : 43 - 48
  • [48] ATTRIBUTE REDUCTION USING DISTANCE-BASED FUZZY ROUGH SETS
    Wang, Changzhong
    Qi, Yali
    He, Qiang
    PROCEEDINGS OF 2015 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOL. 2, 2015, : 860 - 865
  • [49] Unsupervised attribute reduction for mixed data based on fuzzy rough sets
    Yuan, Zhong
    Chen, Hongmei
    Li, Tianrui
    Yu, Zeng
    Sang, Binbin
    Luo, Chuan
    INFORMATION SCIENCES, 2021, 572 : 67 - 87
  • [50] Efficient attribute reduction based on rough sets and differential evolution algorithm
    Jing, Si-Yuan
    Yang, Jun
    2020 16TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND SECURITY (CIS 2020), 2020, : 217 - 222