Learning from Imbalanced Data Using Methods of Sample Selection

Cited by: 0
Authors
Chairi, Ikram [1]
Alaoui, Souad [1]
Lyhyaoui, Abdelouahid [1]
Affiliations
[1] Abdelmalek Essaadi Univ, LTiLab, ENSA Tangier, Tanger Principal Tanger, Morocco
Keywords
Imbalanced data; Multi-Layer Perceptron; sample selection
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Most Machine Learning (ML) algorithms assume that the training sets used for learning are balanced. However, in real-world applications this assumption does not always hold. The problem of between-class imbalance is a challenge that has attracted growing attention from both academia and industry because of its critical influence on the performance of machine learning. Many solutions have been proposed to address this problem; the common practice for dealing with imbalanced data sets is to rebalance them artificially using sampling methods. On the other hand, research shows that Sample Selection (SS) methods help to improve accuracy during the learning process. The main idea of our work is to apply a Sample Selection technique to the majority class in order to undersample the imbalanced data. This procedure makes it possible to deal with the imbalance problem and to improve learning performance.
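The abstract does not say which selection criterion is applied to the majority class, so the following is only a minimal sketch of the general idea under a stated assumption: majority samples are ranked by their Euclidean distance to the minority-class centroid, and only the closest ones are kept. The function name `undersample_majority`, its parameters, and the distance-to-centroid rule are all hypothetical and are not taken from the paper.

```python
import numpy as np


def undersample_majority(X, y, majority_label, n_keep=None):
    """Undersample the majority class by selecting a subset of its samples.

    ASSUMPTION: samples are selected by distance to the minority-class
    centroid. This rule is illustrative only; the paper's actual
    Sample Selection criterion is not given in the abstract.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    maj_idx = np.flatnonzero(y == majority_label)
    min_idx = np.flatnonzero(y != majority_label)

    # By default, keep as many majority samples as there are minority samples.
    if n_keep is None:
        n_keep = len(min_idx)
    n_keep = min(n_keep, len(maj_idx))

    # Rank majority samples by distance to the minority centroid.
    centroid = X[min_idx].mean(axis=0)
    dist = np.linalg.norm(X[maj_idx] - centroid, axis=1)
    selected = maj_idx[np.argsort(dist, kind="stable")[:n_keep]]

    # Return all minority samples plus the selected majority samples,
    # in their original order.
    keep = np.sort(np.concatenate([min_idx, selected]))
    return X[keep], y[keep]


# Usage on synthetic data: 90 majority samples (label 0), 10 minority (label 1).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (90, 2)), rng.normal(2.0, 1.0, (10, 2))])
y = np.array([0] * 90 + [1] * 10)

X_bal, y_bal = undersample_majority(X, y, majority_label=0)
```

After this step both classes contain 10 samples, and the rebalanced set could then be passed to a classifier such as a Multi-Layer Perceptron. Keeping the majority samples nearest the minority class is one plausible choice because those samples lie closest to the decision boundary, but other selection rules would fit the same interface.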
Pages: 256-259
Number of pages: 4