On a systematic test of ML-based systems: Experiments on test statistics

Citations: 0
Authors
Grube, Nicolas [1]
Massah, Mozhdeh [1]
Tebbe, Michael [1]
Wancura, Paul [1]
Wiesbrock, Hans-Werner [1]
Grossmann, Juergen [2]
Kharma, Sami [2]
Affiliations
[1] ITPower Solut GmbH, Berlin, Germany
[2] Fraunhofer Inst Offene Kommunikat Syst FOKUS, Berlin, Germany
Keywords
Testing AI Systems; Black Box Test for AI Systems; Systematic Evaluation of Training data sets; Probabilistic Modeling;
DOI
10.1109/AITest62860.2024.00010
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Machine learning (ML)-based systems are becoming increasingly ubiquitous, even in safety-critical environments. The strength of ML systems, solving complex problems with a stochastic model, leads to challenges in the testing domain. This motivates us to introduce a rigorous testing method for ML models and their application environment, akin to classical software testing, which is independent of the training process and considers the probabilistic nature of ML. The approach is based on the concept of the Probabilistically Extended ONtology (PEON). In brief, PEON is an ontology modeling the designated Operational Design Domain (ODD), extended by assigning probability distributions to classes and their individual attributes, as well as probabilistic dependencies between these attributes. Relevant statistical key figures such as accuracy depend not only on the ML-based model but also strongly on the statistics of the test data set, which we refer to as the quality assurance (QA) data set to emphasize its independence from the test data set used in the training process. This implies that we have to consider the statistical properties of the QA data in order to evaluate an ML-based system. In this paper, we present first experimental results comparing established test selection methods, e.g., N-wise, with a new approach, the PEON. Our findings strongly suggest that the underlying statistical properties of the QA data significantly influence the test results of ML-based systems. In this respect, careful attention must be paid to the statistical independence and balance of the QA data. The PEON provides a good basis for the composition of QA data sets that are not only independent of the development process but also statistically representative and balanced with respect to the modeled ODD.
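The abstract's claim that accuracy depends on the statistics of the QA data set, and not only on the model, can be illustrated with a small numeric sketch. This is not the authors' method or code: the class names, per-class accuracies, and class proportions below are hypothetical values chosen for illustration, and `overall_accuracy` is a helper introduced here. It only shows that one fixed model yields different overall accuracy on QA data sets with different class mixes.

```python
# Toy illustration (not the authors' code): a fixed classifier's overall
# accuracy changes with the class mix of the QA data set.

def overall_accuracy(per_class_accuracy, class_mix):
    """Expected accuracy = sum over classes of P(class) * accuracy on that class."""
    total = sum(class_mix.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("class_mix must sum to 1")
    return sum(class_mix[c] * per_class_accuracy[c] for c in class_mix)

# Hypothetical per-class accuracies of one fixed model.
per_class_accuracy = {"pedestrian": 0.80, "car": 0.95, "cyclist": 0.60}

# Two hypothetical QA data sets that differ only in class proportions.
balanced_mix = {"pedestrian": 1 / 3, "car": 1 / 3, "cyclist": 1 / 3}
car_heavy_mix = {"pedestrian": 0.10, "car": 0.85, "cyclist": 0.05}

acc_balanced = overall_accuracy(per_class_accuracy, balanced_mix)
acc_car_heavy = overall_accuracy(per_class_accuracy, car_heavy_mix)

print(f"balanced QA set:  {acc_balanced:.4f}")
print(f"car-heavy QA set: {acc_car_heavy:.4f}")
```

Under these assumed numbers, the same model scores noticeably higher on the car-heavy set than on the balanced one, which is the kind of effect that motivates controlling the statistical composition of the QA data.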
Pages: 11-20
Page count: 10