Imbalanced Data Over-Sampling Method Based on ISODATA Clustering

Authors
Lv, Zhenzhe [1]
Liu, Qicheng [1]
Affiliations
[1] Yantai Univ, Sch Comp & Control Engn, Yantai 264000, Shandong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
imbalanced data; clustering; oversampling; ISODATA; SMOTE;
DOI
10.1587/transinf.2022EDP7190
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Class imbalance is one of the challenges faced in machine learning. Traditional classifiers have difficulty predicting minority class data, and if the imbalanced data is not processed, classifier performance is greatly reduced. To address the tendency of traditional classifiers to favor the majority class and ignore the minority class, an imbalanced data over-sampling method based on iterative self-organizing data analysis technique algorithm (ISODATA) clustering is proposed. The minority class is divided into sub-clusters by ISODATA, and each sub-cluster is over-sampled according to the sampling ratio, so that the sampled minority class data also conforms to the imbalance of the original minority class data. The new dataset, composed of the new minority class data and the majority class data, is classified by SVM and Random Forest classifiers. Experiments on 12 datasets from the KEEL repository show that the method achieves better G-means and F-value, improving classification accuracy.
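The per-sub-cluster over-sampling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions. The cluster labels are taken as given (the paper obtains them with ISODATA, which is not reproduced here). The number of synthetic points per sub-cluster is assumed to be proportional to sub-cluster size, which is only one reading of "according to the sampling ratio". New points are made by SMOTE-style linear interpolation between two random members of the same sub-cluster, rather than between nearest neighbors. The function name `oversample_by_cluster` and the demo data are hypothetical.

```python
import random
from collections import defaultdict


def oversample_by_cluster(points, labels, n_new, seed=0):
    """Create n_new synthetic minority points, split across sub-clusters.

    Assumption: each sub-cluster's quota is proportional to its size, so the
    over-sampled minority class keeps the original sub-cluster proportions.
    """
    rng = random.Random(seed)

    # Group minority points by their (precomputed) sub-cluster label.
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)

    # Proportional quotas, rounded with the largest-remainder rule so that
    # the quotas sum to exactly n_new.
    total = len(points)
    exact = {c: n_new * len(m) / total for c, m in clusters.items()}
    quota = {c: int(e) for c, e in exact.items()}
    leftover = n_new - sum(quota.values())
    by_remainder = sorted(exact, key=lambda c: exact[c] - quota[c], reverse=True)
    for c in by_remainder[:leftover]:
        quota[c] += 1

    # Interpolate between two random members of the same sub-cluster
    # (a simplification of SMOTE's nearest-neighbor interpolation).
    synthetic = []
    for c, members in clusters.items():
        for _ in range(quota[c]):
            a = rng.choice(members)
            b = rng.choice(members)
            t = rng.random()
            new_point = tuple(x + t * (y - x) for x, y in zip(a, b))
            synthetic.append((new_point, c))
    return synthetic


# Hypothetical minority class with two sub-clusters of sizes 3 and 2.
demo_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0), (11.0, 10.0)]
demo_labels = [0, 0, 0, 1, 1]
demo = oversample_by_cluster(demo_points, demo_labels, n_new=10)
```

Because every synthetic point is interpolated inside a single sub-cluster, no point falls in the empty region between sub-clusters, which is the usual motivation for clustering the minority class before over-sampling.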
Pages: 1528-1536
Page count: 9