Design of SPRINT Parallelization of Data Mining Algorithms Based on Cloud Computing

Authors
Song, Lei [1]
Zhang, Huajie [2]
Feng, Dongdong [3]
Affiliations
[1] Kaifeng Vocat Coll Culture & Arts, Modern Educ Ctr, Kaifeng 475004, Peoples R China
[2] Zhengzhou Univ Technol, Engn Training Ctr, Zhengzhou 450044, Peoples R China
[3] Henan Univ, Sch Software, Kaifeng 475004, Peoples R China
Keywords
data mining; cloud computing; SPRINT algorithm; parallel design; ENERGY; PREDICTION
DOI
Not available
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
Traditional data mining requires all data to be loaded into memory for analysis and calculation. This stand-alone computing mode has low computational efficiency and a high mining failure rate. With the rapid development of data storage and computer technology, how to store and process big data effectively has become an important problem. Cloud computing can quickly obtain resources from a computing resource pool and supports parallel versions of data mining algorithms; combining a cloud computing platform with data mining in this way effectively addresses the bottlenecks of traditional data mining. Therefore, based on the Hadoop cloud computing platform, this paper makes full use of the MapReduce programming framework and proposes a parallel design of decision tree nodes, node attribute metrics, and Gini index ranking for the SPRINT decision tree algorithm. The parallelized SPRINT algorithm is tested for classification accuracy, scalability, and speedup ratio. The results indicate that the parallel design achieves good scalability and parallel speedup while maintaining classification accuracy, which verifies the feasibility of parallelizing data mining algorithms on a cloud computing platform.
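The Gini index ranking mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it emulates the map and reduce phases inside one Python process instead of running on Hadoop, it handles only numeric attributes with binary splits, and every name in it (`gini`, `map_record`, `best_split`, `rank_attributes`, the `label` key, the sample records) is hypothetical. The map step emits one (value, class label) pair per attribute, which mirrors SPRINT's per-attribute lists; the reduce step scans each sorted list once to find the split with the lowest weighted Gini index; the final step ranks attributes by that score.

```python
from collections import Counter, defaultdict


def gini(counts):
    """Gini index of a class histogram: 1 - sum(p_j ** 2)."""
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def map_record(record, label_key="label"):
    """Map step (illustrative): emit (attribute, (value, label)) pairs."""
    for name, value in record.items():
        if name != label_key:
            yield name, (value, record[label_key])


def best_split(attr_list):
    """Reduce step (illustrative): scan one sorted attribute list and
    return (weighted_gini, threshold) for the best binary split."""
    attr_list = sorted(attr_list, key=lambda pair: pair[0])
    total = Counter(label for _, label in attr_list)
    n = len(attr_list)
    below = Counter()
    best = (float("inf"), None)
    for i in range(n - 1):
        value, label = attr_list[i]
        below[label] += 1
        next_value = attr_list[i + 1][0]
        if value == next_value:
            continue  # no valid threshold between equal values
        above = total - below
        n_below = i + 1
        score = (n_below / n) * gini(below) + ((n - n_below) / n) * gini(above)
        if score < best[0]:
            best = (score, (value + next_value) / 2)
    return best


def rank_attributes(records):
    """Group map output by attribute, reduce each group, rank by Gini."""
    grouped = defaultdict(list)
    for record in records:
        for name, pair in map_record(record):
            grouped[name].append(pair)
    results = {name: best_split(pairs) for name, pairs in grouped.items()}
    return sorted(results.items(), key=lambda item: item[1][0])


# Made-up sample data, for illustration only.
records = [
    {"x": 1, "y": 5, "label": "a"},
    {"x": 2, "y": 1, "label": "a"},
    {"x": 3, "y": 6, "label": "b"},
    {"x": 4, "y": 2, "label": "b"},
]
ranking = rank_attributes(records)
print(ranking[0])
```

In this toy data set, attribute `x` separates the two classes perfectly at the threshold 2.5, so it ranks first with a weighted Gini index of 0.0. In a real MapReduce job the grouping by attribute would be done by the shuffle phase, and each attribute list could be reduced on a different node.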
Pages: 399-405
Page count: 7