MDCStream: Stream Data Generator for Testing Analysis Algorithms

Cited by: 3
Authors
Iglesias, Felix [1]
Ojdanic, Denis [1]
Hartl, Alexander [1]
Zseby, Tanja [1]
Affiliations
[1] TU Wien, Inst Telecommun, Vienna, Austria
Keywords
data generation; stream data; synthetic data; multi-dimensional data; concept drift; nonstationarity; MATLAB
DOI
10.1145/3388831.3388832
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
The establishment of modern technological paradigms like ubiquitous computing, big data, cyber-physical systems, or communication networks has strongly increased the need for efficient and effective data stream analysis. MDCStream is a MATLAB tool for generating time-dependent numerical datasets to stress-test stream data classification, clustering, and outlier detection algorithms. MDCStream is built on MDCGen and therefore offers high flexibility for creating a wide diversity of data scenarios. To illustrate the potential of MDCStream, we tested a stream data clustering algorithm recently proposed in the literature on datasets generated with MDCStream. The datasets were designed to pose challenges related to space geometries and concept drift.
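As a rough illustration of what a time-dependent dataset with concept drift can look like, the sketch below generates a stream of points from Gaussian clusters whose centers move linearly over time, mixed with a small share of uniform outliers. It is written in Python and does not use or reproduce the MDCStream or MDCGen API. The function name `drifting_stream` and all parameters (`spread`, `outlier_rate`, the example centers) are hypothetical choices made only for this example.

```python
import random


def drifting_stream(n_points, start_centers, end_centers, spread=0.5,
                    outlier_rate=0.02, seed=0):
    """Yield (t, point, label) tuples for a toy drifting stream.

    Hypothetical helper, not part of MDCStream. Each cluster center moves
    linearly from its start to its end position as t grows, which is one
    simple form of concept drift. A label of -1 marks an outlier.
    """
    rng = random.Random(seed)
    dims = len(start_centers[0])
    for t in range(n_points):
        # Fraction of the stream already emitted, from 0.0 to 1.0.
        frac = t / max(n_points - 1, 1)
        if rng.random() < outlier_rate:
            # Outliers are drawn uniformly from an assumed bounding box.
            point = [rng.uniform(-10.0, 10.0) for _ in range(dims)]
            yield t, point, -1
            continue
        # Pick a cluster and interpolate its center for the current time.
        k = rng.randrange(len(start_centers))
        center = [s + frac * (e - s)
                  for s, e in zip(start_centers[k], end_centers[k])]
        point = [rng.gauss(c, spread) for c in center]
        yield t, point, k


# Two 2-D clusters: the first drifts along x, the second along y.
stream = list(drifting_stream(
    1000,
    start_centers=[[0.0, 0.0], [5.0, 5.0]],
    end_centers=[[3.0, 0.0], [5.0, 1.0]],
))
```

A stream like this can be fed point by point to a clustering or outlier detection algorithm, with the labels held back as ground truth for evaluation. Linear drift is only one possible scenario; abrupt or recurring drift would need a different schedule for `frac`.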
Pages: 56-63
Number of pages: 8