Extreme Learning Machine for Regression and Multiclass Classification

Cited: 4673
Authors
Huang, Guang-Bin [1 ]
Zhou, Hongming [1 ]
Ding, Xiaojian [4 ]
Zhang, Rui [2 ,3 ]
Affiliations
[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore
[2] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore
[3] NW Univ Xian, Dept Math, Xian 710069, Peoples R China
[4] Xi An Jiao Tong Univ, Sch Elect & Informat Engn, Xian 710049, Peoples R China
Source
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS | 2012, Vol. 42, No. 2
Keywords
Extreme learning machine (ELM); feature mapping; kernel; least square support vector machine (LS-SVM); proximal support vector machine (PSVM); regularization network; NETWORKS; ONLINE; APPROXIMATION; CLASSIFIERS; SIZE;
DOI
10.1109/TSMCB.2011.2168604
CLC Number
TP [Automation Technology, Computer Technology];
Subject Classification Number
0812;
Abstract
Due to the simplicity of their implementations, least square support vector machine (LS-SVM) and proximal support vector machine (PSVM) have been widely used in binary classification applications. The conventional LS-SVM and PSVM cannot be used in regression and multiclass classification applications directly, although variants of LS-SVM and PSVM have been proposed to handle such cases. This paper shows that both LS-SVM and PSVM can be simplified further and that a unified learning framework of LS-SVM, PSVM, and other regularization algorithms, referred to as extreme learning machine (ELM), can be built. ELM works for the "generalized" single-hidden-layer feedforward networks (SLFNs), but the hidden layer (also called the feature mapping) in ELM need not be tuned. Such SLFNs include but are not limited to SVM, polynomial networks, and the conventional feedforward neural networks. This paper shows the following: 1) ELM provides a unified learning platform with a widespread type of feature mappings and can be applied in regression and multiclass classification applications directly; 2) from the optimization method point of view, ELM has milder optimization constraints compared to LS-SVM and PSVM; 3) in theory, compared to ELM, LS-SVM and PSVM achieve suboptimal solutions and require higher computational complexity; and 4) in theory, ELM can approximate any target continuous function and classify any disjoint regions. As verified by the simulation results, ELM tends to have better scalability and to achieve similar (for regression and binary-class cases) or much better (for multiclass cases) generalization performance at much faster learning speed (up to thousands of times faster) than traditional SVM and LS-SVM.
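The untuned hidden layer and closed-form output-weight solution the abstract describes can be sketched in a few lines of NumPy. This is a minimal illustrative sketch, not the authors' code: the function names, the tanh activation, the ridge form beta = (I/C + H^T H)^{-1} H^T T, and the toy sine-regression task are all assumptions made for demonstration.

```python
import numpy as np

def elm_train(X, T, n_hidden=100, C=1.0, seed=None):
    """Train a basic ELM: random, untuned hidden layer; only the
    output weights beta are solved, in closed form."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    # Random hidden-layer parameters (never tuned)
    W = rng.standard_normal((n_features, n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)  # hidden-layer feature mapping
    # Regularized least squares: beta = (I/C + H^T H)^{-1} H^T T
    beta = np.linalg.solve(H.T @ H + np.eye(n_hidden) / C, H.T @ T)
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy regression: learn y = sin(x) on [-3, 3]
X = np.linspace(-3, 3, 200).reshape(-1, 1)
T = np.sin(X)
W, b, beta = elm_train(X, T, n_hidden=50, C=1e3, seed=0)
mse = np.mean((elm_predict(X, W, b, beta) - T) ** 2)
```

Because the hidden layer is fixed, training reduces to one linear solve; this is the source of the learning-speed advantage the abstract reports over iteratively trained SVM and LS-SVM.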
Pages: 513-529
Page count: 17
References
Total: 56
[1] [Anonymous], 1989, STAT DATASETS
[2] [Anonymous], 2010, P 18 EUR S ART NEUR
[3] [Anonymous], 1961, PRINCIPLES NEURODYNA
[4] [Anonymous], 2002, Matrices: Theory and Applications
[5] Bartlett, P.L. The sample complexity of pattern classification with neural networks: The size of the weights is more important than the size of the network [J]. IEEE TRANSACTIONS ON INFORMATION THEORY, 1998, 44(02): 525-536.
[6] Blake C. L., 1998, Uci repository of machine learning databases
[7] Bordes A, 2005, J MACH LEARN RES, V6, P1579
[8] Canu S., 2005, SVM KERNEL METHODS M
[9] Collobert, R.; Bengio, S.; Bengio, Y. A parallel mixture of SVMs for very large scale problems [J]. NEURAL COMPUTATION, 2002, 14(05): 1105-1114.
[10] Cortes, C.; Vapnik, V. Support-vector networks [J]. MACHINE LEARNING, 1995, 20(03): 273-297.