vivaGen - a survival data set generator for software testing

Authors
Gietzelt, Matthias [1 ,2 ,3 ]
Karmen, Christian [1 ]
Knaup-Gregori, Petra [1 ]
Ganzinger, Matthias [1 ]
Affiliations
[1] Heidelberg Univ, Inst Med Biometry & Informat, Neuenheimer Feld 130-3, D-69120 Heidelberg, Germany
[2] TU Braunschweig, Peter L Reichertz Inst Med Informat, Carl Neuberg Str 1, D-30625 Hannover, Germany
[3] Hannover Med Sch, Carl Neuberg Str 1, D-30625 Hannover, Germany
Keywords
Data set generator; Survival data; Biomarker; Java
DOI
10.1186/s12859-020-3478-x
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: Software testing is an essential part of the software development process, but real-world data may not be suitable or available for testing purposes. In the medical context, it can be especially hard to obtain the necessary test data for reasons such as privacy concerns. To overcome these obstacles and provide data for the necessary thorough tests of software, the generation of simulated data sets can be a solution. In this paper, we focus on the challenging task of generating such survival data sets containing known effects. So far, no user-friendly software exists for the simulation of survival data, as they are typically derived from clinical trials with follow-ups. Results: To overcome these shortcomings, we developed an easy-to-use software package called vivaGen. In our Java software, parameters of survival time distributions are replaced by comprehensive measures that practitioners can configure more intuitively. vivaGen is equipped with a graphical frontend that allows users to adjust parameters and visualize the results in survival plots of the simulated cohorts. Conclusions: vivaGen is freely available and published as open source. It provides a novel way to generate test data sets based on probability distributions in a comprehensive and user-friendly way.
Pages: 13