Stochastic parallel extreme artificial hydrocarbon networks: An implementation for fast and robust supervised machine learning in high-dimensional data

Cited by: 17
Authors
Ponce, Hiram [1]
de Campos Souza, Paulo V. [2]
Guimaraes, Augusto Junio [2]
Gonzalez-Mora, Guillermo [1]
Affiliations
[1] Univ Panamer, Fac Ingn, Augusto Rodin 498, Mexico City 03920, DF, Mexico
[2] Fac UNA Betim, Av Gov Valadares 640, BR-32510010 Betim, MG, Brazil
Keywords
Machine learning; Parallel computing; Extreme learning machines; Stochastic learning; Regression; Classification; Big data; Particle swarm optimization; Gradient descent; Algorithms
DOI
10.1016/j.engappai.2019.103427
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Artificial hydrocarbon networks (AHN), a supervised learning method inspired by organic chemical structures and mechanisms, have shown improvements in predictive power and interpretability in comparison with other well-known machine learning models. However, training AHN is very time-consuming, and until now the method has not been able to deal with large data. In this paper, we introduce the stochastic parallel extreme artificial hydrocarbon networks (SPE-AHN), an algorithm for fast and robust training of supervised AHN models on high-dimensional data. This training method comprises a population-based meta-heuristic optimization with a defined individual encoding and an objective function related to the AHN model, a parallel-computing implementation, and a stochastic learning approach for consuming large data. We conducted three experiments with synthetic and real data sets to validate the training execution time and performance of the proposed algorithm. Experimental results demonstrated that the proposed SPE-AHN outperforms the original AHN method, speeding up training by more than 10,000 times in the worst-case scenario. Additionally, we present two case studies on real data sets: solar-panel deployment prediction (a regression problem), and classification of human falls and daily activities in healthcare monitoring systems (a classification problem). These case studies showed that SPE-AHN improves on state-of-the-art machine learning models in both engineering problems. We anticipate that our new training algorithm will be useful in many applications of AHN, such as robotics, finance, medical engineering, aerospace, and others, in which large amounts of data (e.g. big data) are essential.
Pages: 16
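The abstract names three ingredients of the training method: a population-based meta-heuristic, parallel evaluation, and stochastic learning on subsets of the data. The sketch below is only an illustration of how those three pieces can fit together, not the authors' SPE-AHN algorithm. It assumes particle swarm optimization as the meta-heuristic (suggested by the keywords, not confirmed by the abstract), uses a plain linear model as a stand-in because the AHN model structure is not described in this record, and all names (`make_data`, `mse`, `train`) and hyperparameters are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def make_data(n, rng):
    """Synthetic regression data: y = 2*x1 - 3*x2 + 0.5 plus small noise."""
    data = []
    for _ in range(n):
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        y = 2.0 * x1 - 3.0 * x2 + 0.5 + rng.gauss(0, 0.01)
        data.append((x1, x2, y))
    return data


def mse(params, batch):
    """Objective: mean squared error of the linear stand-in model on a batch."""
    w1, w2, b = params
    return sum((w1 * x1 + w2 * x2 + b - y) ** 2 for x1, x2, y in batch) / len(batch)


def train(data, n_particles=20, n_iters=200, batch_size=64, seed=0):
    """Particle swarm search over (w1, w2, b), scored on random mini-batches."""
    rng = random.Random(seed)
    dim = 3  # individual encoding (assumed): one particle = one parameter vector
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    g_pos = pos[0][:]
    # Threads show the parallel structure only; CPU-bound pure-Python code
    # would need a process pool to gain real speed in CPython.
    with ThreadPoolExecutor(max_workers=4) as pool:
        for _ in range(n_iters):
            # Stochastic step: draw a fresh mini-batch each iteration.
            batch = rng.sample(data, batch_size)
            # Parallel step: score current positions and personal bests on
            # the same batch so that comparisons are fair.
            cur = list(pool.map(lambda p: mse(p, batch), pos))
            old = list(pool.map(lambda p: mse(p, batch), best_pos))
            for i in range(n_particles):
                if cur[i] < old[i]:
                    best_pos[i], old[i] = pos[i][:], cur[i]
            g = min(range(n_particles), key=old.__getitem__)
            g_pos = best_pos[g][:]
            # Standard velocity and position update (inertia 0.7, pulls 1.5).
            for i in range(n_particles):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    vel[i][d] = (0.7 * vel[i][d]
                                 + 1.5 * r1 * (best_pos[i][d] - pos[i][d])
                                 + 1.5 * r2 * (g_pos[d] - pos[i][d]))
                    pos[i][d] += vel[i][d]
    return g_pos


if __name__ == "__main__":
    data = make_data(500, random.Random(1))
    params = train(data)
    print("fitted parameters:", params)
    print("full-data MSE:", mse(params, data))
```

Re-scoring the personal bests on every new mini-batch doubles the number of objective evaluations, but it avoids comparing scores measured on different batches, which is one simple way to keep a population-based search stable when the objective is sampled stochastically.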