Performance Analysis of Data Mining Algorithms Based on PCA

Cited by: 0
Authors
Bai, Ruifeng [1]
Wang, Jie [2]
Yang, Lin [2]
Pan, Jingchang [2]
Affiliations
[1] Shandong Univ, Coll Business, Weihai 264209, Peoples R China
[2] Shandong Univ, School Mech Elect & Informat Engn, Weihai 264209, Peoples R China
Keywords
PCA; Classification; Clustering; Spectrum; Cataclysmic Variable Star; DIGITAL SKY SURVEY; CATACLYSMIC VARIABLES;
DOI
Not available
Chinese Library Classification
T [Industrial Technology];
Discipline classification code
08 ;
Abstract
Data mining algorithms behave differently in different application contexts, so characterizing the relevant algorithms is an important topic. This paper studies PCA-based dimension reduction and the performance of four data mining algorithms (ANN, Bayes, KNN, and K-means) under different dimension reduction rates when finding cataclysmic variable stars (CVs) in a hybrid celestial spectra dataset. The dataset, selected from the Sloan Digital Sky Survey (SDSS), contains 1417 spectra in total, including 15 CVs along with other types of celestial bodies. A series of experiments analyzes in detail the classification accuracy and time cost of each of the four algorithms in finding CVs under different PCA dimensions. The study helps clarify the inherent characteristics of the four algorithms and supports better choices in future data mining applications.
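The kind of experiment the abstract describes (reduce the data with PCA to k dimensions, then record a classifier's accuracy and time cost for each k) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic two-class points rather than SDSS spectra, the classifier is a plain 1-nearest-neighbour rule standing in for KNN, PCA is computed by power iteration with deflation, and all function names (`pca_fit`, `pca_transform`, `nn_predict`, `make_data`, `evaluate`) are hypothetical.

```python
import math
import random
import time


def pca_fit(rows, k, iters=300, seed=0):
    """Return (mean, components) for the k leading principal directions."""
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    components = []
    for _ in range(k):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):  # power iteration towards the top eigenvector
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
        # Deflation: subtract the found direction so the next pass finds the next one.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
        components.append(v)
    return mean, components


def pca_transform(rows, mean, components):
    """Project rows onto the given principal directions."""
    d = len(mean)
    return [[sum((r[j] - mean[j]) * c[j] for j in range(d)) for c in components]
            for r in rows]


def nn_predict(train_x, train_y, x):
    """Label of the nearest training point (KNN with K = 1, squared distance)."""
    def sq_dist(a):
        return sum((p - q) ** 2 for p, q in zip(a, x))
    best = min(range(len(train_x)), key=lambda i: sq_dist(train_x[i]))
    return train_y[best]


def make_data(n_per_class, rng):
    """Synthetic 6-dimensional data (assumption); class 1 is shifted along feature 0."""
    sds = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5]
    xs, ys = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0, sd) for sd in sds]
            row[0] += 8.0 * label
            xs.append(row)
            ys.append(label)
    return xs, ys


def evaluate(k, train, test):
    """Return (accuracy, elapsed seconds) for PCA to k dimensions followed by 1-NN."""
    (train_x, train_y), (test_x, test_y) = train, test
    start = time.perf_counter()
    mean, comps = pca_fit(train_x, k)
    tr = pca_transform(train_x, mean, comps)
    te = pca_transform(test_x, mean, comps)
    hits = sum(nn_predict(tr, train_y, x) == y for x, y in zip(te, test_y))
    return hits / len(test_y), time.perf_counter() - start


rng = random.Random(42)
train = make_data(40, rng)
test = make_data(20, rng)
results = {k: evaluate(k, train, test) for k in (1, 2, 4)}
for k, (acc, secs) in results.items():
    print(f"k={k}: accuracy={acc:.2f}, time={secs * 1000:.1f} ms")
```

The printed figures describe the synthetic data only and say nothing about the results reported in the paper; the point is the shape of the loop, in which the same classifier is scored and timed at each PCA dimension.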
Pages: 1506-1509
Number of pages: 4