Fine Particulate Matter Concentration Level Prediction by using Tree-based Ensemble Classification Algorithms

Authors
Zhao, Yin [1]
Abu Hasan, Yahya [1]
Affiliations
[1] Univ Sains Malaysia, Sch Math Sci, George Town, Penang, Malaysia
Keywords
Random Forest; C5.0; PM2.5; prediction; data mining
DOI
Not available
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
Pollutant forecasting is an important problem in the environmental sciences, and data mining offers a way to discover knowledge from large data sets. This paper uses data mining methods to forecast the concentration level of PM2.5, an important air pollutant. Several tree-based classification algorithms are available in data mining, such as CART, C4.5, Random Forest (RF) and C5.0. RF and C5.0 are popular ensemble methods: RF builds on CART with bagging, and C5.0 builds on C4.5 with boosting. This paper builds predictive models of PM2.5 concentration level based on RF and C5.0 using R packages. The data set covers the period 2000-2011 in a new town of Hong Kong. The PM2.5 concentration is divided into two levels, with the critical point at 25 µg/m³ (24-hour mean). According to 100 repetitions of 10-fold cross-validation, the best testing accuracy comes from the RF model, at around 0.845 to 0.854.
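The evaluation protocol described in the abstract (two concentration levels split at 25 µg/m³, scored by 100 repetitions of 10-fold cross-validation) can be sketched as follows. This is a minimal illustration, not the authors' code: the paper uses R packages for RF and C5.0, whereas the sketch below is plain Python with a majority-class placeholder standing in for the classifier. The function names, the synthetic readings, and the choice to treat values strictly above the threshold as "high" are all assumptions made for illustration.

```python
import random
from statistics import mean

THRESHOLD = 25.0  # µg/m³ (24-hour mean), the critical point given in the abstract


def to_level(pm25):
    """Map a concentration to one of two levels.

    Assumption: values strictly above the threshold count as "high";
    the abstract does not say how a value of exactly 25 is treated.
    """
    return "high" if pm25 > THRESHOLD else "low"


def k_fold_indices(n, k, rng):
    """Shuffle the indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def majority_fit(train_labels):
    """Placeholder model (hypothetical stand-in for RF or C5.0):
    always predicts the most common training label."""
    return max(sorted(set(train_labels)), key=train_labels.count)


def repeated_cv_accuracy(labels, k=10, repeats=100, seed=0):
    """Return one mean test accuracy per repetition of k-fold cross-validation."""
    rng = random.Random(seed)
    run_accuracies = []
    for _ in range(repeats):
        fold_accuracies = []
        for test_idx in k_fold_indices(len(labels), k, rng):
            held_out = set(test_idx)
            train = [lab for i, lab in enumerate(labels) if i not in held_out]
            guess = majority_fit(train)
            hits = sum(labels[i] == guess for i in test_idx)
            fold_accuracies.append(hits / len(test_idx))
        run_accuracies.append(mean(fold_accuracies))
    return run_accuracies


# Synthetic readings for demonstration only; these are not the paper's data.
demo_rng = random.Random(1)
readings = [demo_rng.uniform(5.0, 60.0) for _ in range(200)]
labels = [to_level(x) for x in readings]
accuracies = repeated_cv_accuracy(labels, k=10, repeats=100)
print(f"accuracy range over repetitions: {min(accuracies):.3f} to {max(accuracies):.3f}")
```

Repeating the 10-fold split 100 times yields a range of accuracies rather than a single number, which is how a result such as "0.845 to 0.854" arises. Swapping `majority_fit` for a real classifier trained on predictor variables would leave the surrounding loop unchanged.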
Pages: 21-27
Number of pages: 7