Gaussian Sampling Approach to deal with Imbalanced Telemetry Datasets in Industrial Applications

Cited by: 0
Authors
Galve, Sergio [1]
Puig, Vicenc [2]
Vilajosana, Xavi [1]
Affiliations
[1] Univ Oberta Catalunya, Wireless Networks Res Lab, Castelldefels 08860, Barcelona, Spain
[2] UPC, CSIC, Inst Robot & Informat Ind, Barcelona, Spain
DOI
10.1109/MED59994.2023.10185829
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Practical implementation of data analytics in industrial environments has always been problematic because of issues with data availability and quality. This paper proposes a Gaussian sampling methodology to address imbalanced telemetry datasets, one of the root causes of unreliable modelling. The method generates subsets with homogeneous density distributions, and its impact is compared against the baseline case of random sampling. A case study based on an industrial cooling device is used to assess and illustrate the proposed approach.
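The abstract does not spell out the sampling procedure, so the following is only an illustrative sketch of the general idea it describes: drawing a subset whose density is more homogeneous than a plain random subset. It assumes one particular technique, inverse-density weighting with a Gaussian kernel density estimate, which may differ from the authors' method. The function names, the bandwidth, and the synthetic one-dimensional "telemetry" readings are all hypothetical.

```python
import math
import random


def gaussian_kde_1d(points, bandwidth):
    """Estimate the density at each point with a Gaussian kernel (O(n^2))."""
    n = len(points)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    densities = []
    for x in points:
        total = sum(math.exp(-0.5 * ((x - y) / bandwidth) ** 2) for y in points)
        densities.append(norm * total)
    return densities


def density_balanced_sample(points, k, bandwidth, rng):
    """Draw k points with replacement, weighting each by 1 / estimated density.

    Assumption: inverse-density weighting stands in for the paper's
    Gaussian sampling step, which the abstract does not detail.
    """
    densities = gaussian_kde_1d(points, bandwidth)
    weights = [1.0 / d for d in densities]
    return rng.choices(points, weights=weights, k=k)


def share_in_range(sample, low, high):
    """Fraction of the sample that falls in [low, high)."""
    return sum(low <= x < high for x in sample) / len(sample)


rng = random.Random(0)

# Synthetic imbalanced data (hypothetical): 950 readings around 20
# (common operating regime) and 50 readings around 35 (rare regime).
readings = [rng.gauss(20.0, 1.0) for _ in range(950)]
readings += [rng.gauss(35.0, 1.0) for _ in range(50)]

# Baseline: uniform random subset without replacement.
random_subset = rng.sample(readings, 200)

# Density-balanced subset using the Gaussian kernel estimate.
balanced_subset = density_balanced_sample(readings, 200, bandwidth=1.0, rng=rng)

rare_random = share_in_range(random_subset, 30.0, 40.0)
rare_balanced = share_in_range(balanced_subset, 30.0, 40.0)
print("rare-regime share, random sampling:  ", rare_random)
print("rare-regime share, balanced sampling:", rare_balanced)
```

In this sketch the random subset keeps roughly the original imbalance, while the weighted subset over-represents the sparse regime so that both regimes are covered more evenly. Sampling with replacement is a simplification; a practical pipeline might instead sample without replacement or bin the data first.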
Pages: 605-611
Number of pages: 7
Related papers
50 in total
  • [21] Comparison of Evaluation Metrics in Classification Applications with Imbalanced Datasets
    Fatourechi, Mehrdad
    Ward, Rabab K.
    Mason, Steven G.
    Huggins, Jane
    Schloegl, Alois
    Birch, Gary E.
    SEVENTH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, PROCEEDINGS, 2008, : 777 - +
  • [22] Combining integrated sampling with SVM ensembles for learning from imbalanced datasets
    Liu, Yang
    Yu, Xiaohui
    Huang, Jimmy Xiangji
    An, Aijun
    INFORMATION PROCESSING & MANAGEMENT, 2011, 47 (04) : 617 - 631
  • [23] Handling imbalanced datasets by partially guided hybrid sampling for pattern recognition
    Sandhan, Tushar
    Choi, Jin Young
    2014 22ND INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2014, : 1449 - 1453
  • [24] Cluster-Based Minority Over-Sampling for Imbalanced Datasets
    Puntumapon, Kamthorn
    Rakthamamon, Thanawin
    Waiyamai, Kitsana
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2016, E99D (12) : 3101 - 3109
  • [25] Mitigating false negatives in imbalanced datasets: An ensemble approach
    Vasconcelos, Marcelo
    Cavique, Luis
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 262
  • [26] DAuGAN: An Approach for Augmenting Time Series Imbalanced Datasets via Latent Space Sampling Using Adversarial Techniques
    Bratu, Andrei
    Czibula, Gabriela
    SCIENTIFIC PROGRAMMING, 2021, 2021 (2021)
  • [27] Variable Importance Analysis in Imbalanced Datasets: A New Approach
    Ahrazem Dfuf, Ismael
    Forte Perez-Minayo, Joaquin
    Mira Mcwilliams, Jose Manuel
    Gonzalez Fernandez, Camino
    IEEE ACCESS, 2020, 8 : 127404 - 127430
  • [28] Feature Selection and Ensemble Hierarchical Cluster-based Under-sampling Approach for Extremely Imbalanced Datasets
    Soltani, Sima
    Sadri, Javad
    Torshizi, Hassan Ahmadi
    2011 1ST INTERNATIONAL ECONFERENCE ON COMPUTER AND KNOWLEDGE ENGINEERING (ICCKE), 2011, : 166 - 171
  • [29] Effect of Imbalanced Datasets on Security of Industrial IoT Using Machine Learning
    Zolanvari, Maede
    Teixeira, Marcio A.
    Jain, Raj
    2018 IEEE INTERNATIONAL CONFERENCE ON INTELLIGENCE AND SECURITY INFORMATICS (ISI), 2018, : 112 - 117
  • [30] A Gaussian mixture model based virtual sample generation approach for small datasets in industrial processes
    Li, Ling
    Damarla, Seshu Kumar
    Wang, Yalin
    Huang, Biao
    INFORMATION SCIENCES, 2021, 581 : 262 - 277