A hierarchical VQSVM for imbalanced data sets

Cited by: 4

Authors:

Yu, Ting [1]

Jan, Tony [1]

Simoff, Simeon [1]

Debenham, John [1]

Affiliations:

[1] Univ Technol Sydney, Fac Informat Technol, Sydney, NSW 2007, Australia

Source:

2007 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-6 | 2007

Keywords:

DOI:

10.1109/IJCNN.2007.4371010

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

First, a hierarchical modelling method, VQSVM, is introduced, and some remarks on it are given. Second, the proposed VQSVM is applied to a nonstandard learning environment: imbalanced data sets. On extremely imbalanced, high-dimensional data sets, standard machine learning techniques tend to be overwhelmed by the large classes. The hierarchical VQSVM combines a set of local models, i.e. the codevectors produced by vector quantization, with a global model, i.e. a support vector machine, to rebalance data sets without significant information loss. Issues such as distortion and support vectors are discussed to address the trade-off between information loss and undersampling rate. Experiments compare VQSVM with random resampling techniques on several imbalanced data sets with varied imbalance ratios. The results show that VQSVM performs as well as or better than random resampling, especially on extremely imbalanced large data sets.
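
The rebalancing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the majority class is the one being compressed, uses plain Lloyd's algorithm (k-means) as the vector quantizer, and sets the codebook size equal to the minority class size. The names `quantize` and `rebalance` and the toy data are hypothetical. The global SVM stage is left out; the rebalanced set would be passed to any SVM trainer.

```python
import math
import random


def quantize(points, k, iters=20, seed=0):
    """Lloyd's algorithm (k-means): return k codevectors and the mean distortion."""
    rng = random.Random(seed)
    codebook = [list(p) for p in rng.sample(points, k)]  # initialise from the data
    for _ in range(iters):
        # Assign each point to its nearest codevector.
        cells = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, codebook[j]))
            cells[nearest].append(p)
        # Move each codevector to the centroid of its cell (unchanged if the cell is empty).
        for j, cell in enumerate(cells):
            if cell:
                codebook[j] = [sum(col) / len(cell) for col in zip(*cell)]
    # Mean squared distance to the nearest codevector: a proxy for information loss.
    distortion = sum(
        min(math.dist(p, c) ** 2 for c in codebook) for p in points
    ) / len(points)
    return codebook, distortion


def rebalance(majority, minority, seed=0):
    """Replace the majority class by as many codevectors as there are minority samples.

    Matching the codebook size to the minority size is an assumption of this sketch.
    """
    codebook, distortion = quantize(majority, k=len(minority), seed=seed)
    X = codebook + [list(p) for p in minority]
    y = [-1] * len(codebook) + [+1] * len(minority)
    return X, y, distortion


# Toy data (hypothetical): 200 majority points and 10 minority points in 2 dimensions.
rng = random.Random(42)
majority = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(200)]
minority = [(rng.gauss(3.0, 0.5), rng.gauss(3.0, 0.5)) for _ in range(10)]

X, y, distortion = rebalance(majority, minority)
print(len(X), y.count(-1), y.count(+1), round(distortion, 3))
```

In this sketch a larger codebook lowers the distortion but leaves the classes less balanced, which is one way to read the trade-off between information loss and undersampling rate mentioned in the abstract.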


Pages: 518-523

Number of pages: 6

