Instance-based entropy fuzzy support vector machine for imbalanced data

Cited by: 0
Authors
Poongjin Cho
Minhyuk Lee
Woojin Chang
Affiliations
[1] Seoul National University, Department of Industrial Engineering
[2] Samsung Electronics, Big Data Analytics Group, Mobile Communications Business
Keywords
Fuzzy support vector machine; Imbalanced dataset; Entropy; Pattern recognition; Nearest neighbor
DOI
Not available
Abstract
Imbalanced classification has been a major challenge for machine learning because many standard classifiers are designed for balanced datasets and tend to produce results biased toward the majority class. We modify the entropy fuzzy support vector machine (EFSVM) and introduce the instance-based entropy fuzzy support vector machine (IEFSVM). Both EFSVM and IEFSVM use the entropy information of the k-nearest neighbors to determine a fuzzy membership value for each sample, which sets the importance of that sample. IEFSVM considers the diversity of entropy patterns for each sample as the neighborhood size k increases, whereas EFSVM uses a single entropy value computed with one fixed neighborhood size for all samples. By varying k, the determination of the fuzzy membership value can reflect how the composition of a sample's neighbors changes from near to far. Numerical experiments on 35 public and 12 real-world imbalanced datasets are performed to validate IEFSVM, and the area under the receiver operating characteristic curve (AUC) is used to compare its performance with other SVMs and machine learning methods. IEFSVM shows a much higher AUC value for datasets with a high imbalance ratio, implying that IEFSVM is effective in dealing with the class imbalance problem.
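The idea of deriving a fuzzy membership from neighbor entropy at several neighborhood sizes can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' formulation: the abstract does not give the membership formula, so the function names, the averaging of entropy over the chosen values of k, and the linear weighting with `beta` are all hypothetical choices. The sketch also assumes binary labels, so that base-2 entropy stays within [0, 1].

```python
import math
from collections import Counter


def neighbor_entropy(labels):
    """Shannon entropy (base 2) of the class labels in a neighbor set."""
    n = len(labels)
    counts = Counter(labels)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def entropy_profile(X, y, i, k_values):
    """Entropy of sample i's k nearest neighbors, one value per k."""
    # Rank every other sample by Euclidean distance to sample i.
    order = sorted(
        (j for j in range(len(X)) if j != i),
        key=lambda j: math.dist(X[i], X[j]),
    )
    return [neighbor_entropy([y[j] for j in order[:k]]) for k in k_values]


def fuzzy_memberships(X, y, k_values, minority_label, beta=0.5):
    """Hypothetical weighting (an assumption, not the paper's formula):
    minority samples keep membership 1.0; a majority sample gets
    1 - beta * (mean entropy over k_values), so samples whose
    neighborhoods are mixed across classes are weighted less."""
    memberships = []
    for i in range(len(X)):
        if y[i] == minority_label:
            memberships.append(1.0)
        else:
            profile = entropy_profile(X, y, i, k_values)
            memberships.append(1.0 - beta * sum(profile) / len(profile))
    return memberships


if __name__ == "__main__":
    # Toy one-dimensional data: four majority (0) and two minority (1) samples.
    X = [(0.2,), (1.0,), (2.5,), (4.5,), (5.0,), (6.2,)]
    y = [0, 0, 0, 0, 1, 1]
    print(entropy_profile(X, y, 3, [1, 2, 3]))
    print(fuzzy_memberships(X, y, [1, 2, 3], minority_label=1))
```

In this toy setup, the entropy profile of a sample changes as k grows and more distant neighbors enter the set, which is the effect the abstract attributes to varying k. The resulting memberships could then be supplied to an SVM implementation that accepts per-sample weights.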
Pages: 1183 - 1202
Number of pages: 19
Related papers
50 in total
  • [21] Performance of Support Vector Machine in Imbalanced Data Set
    Novakovic, Jasmina
    Markovic, Suzana
    2020 19TH INTERNATIONAL SYMPOSIUM INFOTEH-JAHORINA (INFOTEH), 2020,
  • [22] Support vector machine classification trees based on fuzzy entropy of classification
    Harrington, Peter de Boves
    ANALYTICA CHIMICA ACTA, 2017, 954 : 14 - 21
  • [23] Data reduction for instance-based learning using entropy-based partitioning
    Son, SH
    Kim, JY
    COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2006, PT 3, 2006, 3982 : 590 - 599
  • [24] RIONIDA: A novel algorithm for imbalanced data combining instance-based learning and rule induction
    Gora, Grzegorz
    Skowron, Andrzej
    INFORMATION SCIENCES, 2025, 708
  • [25] RIONIDA: A Novel Algorithm for Imbalanced Data Combining Instance-Based Learning and Rule Induction
    Gora, Grzegorz
    Skowron, Andrzej
    ROUGH SETS, PT I, IJCRS 2024, 2024, 14839 : 201 - 219
  • [26] Imbalanced Data Classification Based on Hybrid Resampling and Twin Support Vector Machine
    Cao, Lu
    Shen, Hong
    COMPUTER SCIENCE AND INFORMATION SYSTEMS, 2017, 14 (03) : 579 - 595
  • [27] Imbalanced data classification based on scaling kernel-based support vector machine
    Yong Zhang
    Panpan Fu
    Wenzhe Liu
    Guolong Chen
    Neural Computing and Applications, 2014, 25 : 927 - 935
  • [28] Imbalanced data classification based on scaling kernel-based support vector machine
    Zhang, Yong
    Fu, Panpan
    Liu, Wenzhe
    Chen, Guolong
    NEURAL COMPUTING & APPLICATIONS, 2014, 25 (3-4) : 927 - 935
  • [29] Support vector machine for classification based on fuzzy training data
    Ji, Ai-bing
    Pang, Jia-hong
    Qiu, Hong-jie
    EXPERT SYSTEMS WITH APPLICATIONS, 2010, 37 (04) : 3495 - 3498
  • [30] Support vector machine for classification based on fuzzy training data
    Ji, Ai-Bing
    Pang, Jia-Hong
    Li, Shu-Huan
    Sun, Jian-Ping
    PROCEEDINGS OF 2006 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2006, : 1609 - +