MDCStream: Stream Data Generator for Testing Analysis Algorithms

Cited by: 3
Authors
Iglesias, Felix [1]
Ojdanic, Denis [1]
Hartl, Alexander [1]
Zseby, Tanja [1]
Affiliations
[1] TU Wien, Inst Telecommun, Vienna, Austria
Keywords
data generation; stream data; synthetic data; multi-dimensional data; concept drift; nonstationarity; MATLAB
DOI
10.1145/3388831.3388832
Chinese Library Classification number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
The establishment of modern technological paradigms like ubiquitous computing, big data, cyber-physical systems, or communication networks has strongly increased the need for efficient, effective data stream analysis. MDCStream is a MATLAB tool for generating time-dependent numerical datasets in order to stress-test stream data classification, clustering, and outlier detection algorithms. MDCStream is built on MDCGen and therefore offers high flexibility for creating a wide diversity of data scenarios. To illustrate the potential of MDCStream, we tested a stream data clustering algorithm recently proposed in the literature on datasets generated with MDCStream. The datasets were designed to pose challenges related to space geometries and concept drift.
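As a rough illustration of what "concept drift" in a generated stream can mean, the following minimal Python sketch emits points from a single Gaussian cluster whose centre moves linearly over time. It is not MDCStream or its API (MDCStream is a MATLAB tool whose interface is not described in this record); the function name `drifting_stream` and its parameters `start`, `end`, `spread`, and `seed` are hypothetical and chosen only for this example.

```python
import random


def drifting_stream(n_samples, start, end, spread=0.5, seed=0):
    """Yield (t, point) pairs from one Gaussian cluster whose centre
    moves linearly from `start` to `end` over the stream (gradual drift).

    Hypothetical helper for illustration only; not part of MDCStream.
    """
    rng = random.Random(seed)
    for t in range(n_samples):
        # Fraction of the stream elapsed, in [0, 1].
        frac = t / max(n_samples - 1, 1)
        # Interpolate the cluster centre between start and end.
        centre = [s + frac * (e - s) for s, e in zip(start, end)]
        # Draw each coordinate independently around the current centre.
        point = [rng.gauss(c, spread) for c in centre]
        yield t, point


def mean_along(points, dim):
    """Average of one coordinate over a list of points."""
    return sum(p[dim] for p in points) / len(points)


stream = list(drifting_stream(1000, start=[0.0, 0.0], end=[5.0, -3.0]))
early = [p for _, p in stream[:100]]
late = [p for _, p in stream[-100:]]

# The cluster mean at the end of the stream differs from the mean at the start.
print("early mean:", mean_along(early, 0), mean_along(early, 1))
print("late mean: ", mean_along(late, 0), mean_along(late, 1))
```

A stream clustering or outlier detection algorithm fed with such data has to track the moving centre rather than assume a stationary distribution, which is the kind of stress the abstract refers to.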
Pages: 56-63
Number of pages: 8