Imbalance: Oversampling algorithms for imbalanced classification in R

Cited by: 59
Authors
Cordon, Ignacio [1]
Garcia, Salvador [1]
Fernandez, Alberto [1]
Herrera, Francisco [1]
Affiliations
[1] Univ Granada, DaSCI Andalusian Inst Data Sci & Computat Intelli, Granada, Spain
Keywords
Oversampling; Imbalanced classification; Machine learning; Preprocessing; SMOTE; Software
DOI
10.1016/j.knosys.2018.07.035
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Addressing imbalanced datasets in classification tasks is a relevant topic in research studies. The main reason is that, for standard classification algorithms, the success rate when identifying minority class instances may be adversely affected. Among the different solutions to cope with this problem, data-level techniques have shown robust behavior. In this paper, the novel imbalance package is introduced. Written in R and C++ and available from the CRAN repository, this library includes recent, relevant oversampling algorithms to improve the quality of data in imbalanced datasets prior to performing a learning task. The main features of the package, as well as some illustrative examples of its use, are detailed throughout this manuscript. (C) 2018 Elsevier B.V. All rights reserved.
Pages: 329-341
Page count: 13
Related papers
50 in total
  • [41] A novel oversampling and feature selection hybrid algorithm for imbalanced data classification
    Fang Feng
    Kuan-Ching Li
    Erfu Yang
    Qingguo Zhou
    Lihong Han
    Amir Hussain
    Mingjiang Cai
    Multimedia Tools and Applications, 2023, 82 : 3231 - 3267
  • [42] Imbalanced fault classification of rolling bearing based on an improved oversampling method
    Han, Yanfang
    Li, Baozhu
    Huang, Yingkun
    Li, Liang
    Yan, Kang
    JOURNAL OF THE BRAZILIAN SOCIETY OF MECHANICAL SCIENCES AND ENGINEERING, 2023, 45 (04)
  • [43] A Combined Priori and Purity Gaussian OverSampling Algorithm for Imbalanced Data Classification
    Tao, Liangliang
    Zhu, Huping
    Wang, Qingya
    Liang, Yage
    Deng, Xiaozheng
    IEEE ACCESS, 2023, 11 : 130688 - 130696
  • [44] CSMOUTE: Combined Synthetic Oversampling and Undersampling Technique for Imbalanced Data Classification
    Koziarski, Michal
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [45] An Adaptive Safe-Region Diversity Oversampling Algorithm for Imbalanced Classification
    Tao, Liangliang
    Li, Huixian
    Wang, Faqiang
    Liu, Maomao
    Tang, Zhao
    Wang, Qingya
    IEEE ACCESS, 2024, 12 : 63713 - 63724
  • [46] Combining Random Subspace Approach with SMOTE Oversampling for Imbalanced Data Classification
    Ksieniewicz, Pawel
    HYBRID ARTIFICIAL INTELLIGENT SYSTEMS, HAIS 2019, 2019, 11734 : 660 - 673
  • [47] Binary imbalanced data classification based on diversity oversampling by generative models
    Zhai, Junhai
    Qi, Jiaxing
    Shen, Chu
    INFORMATION SCIENCES, 2022, 585 : 313 - 343
  • [48] An Improving Majority Weighted Minority Oversampling Technique for Imbalanced Classification Problem
    Wang, Chao-Ran
    Shao, Xin-Hui
    IEEE ACCESS, 2021, 9 : 5069 - 5082
  • [49] Imbalanced fault classification of rolling bearing based on an improved oversampling method
    Yanfang Han
    Baozhu Li
    Yingkun Huang
    Liang Li
    Kang Yan
    Journal of the Brazilian Society of Mechanical Sciences and Engineering, 2023, 45
  • [50] SGO: An innovative oversampling approach for imbalanced datasets using SVM and genetic algorithms
    Deng, Jianfeng
    Wang, Dongmei
    Gu, Jinan
    Chen, Chen
    INFORMATION SCIENCES, 2025, 690