dSalmon: High-Speed Anomaly Detection for Evolving Multivariate Data Streams

Cited by: 0
Authors
Hartl, Alexander [1]
Iglesias, Felix [1]
Zseby, Tanja [1]
Affiliations
[1] TU Wien Inst Telecommun, A-1040 Vienna, Austria
Source
PERFORMANCE EVALUATION METHODOLOGIES AND TOOLS, VALUETOOLS 2023 | 2024 / Vol. 539
Keywords
Outlier detection; Data streams; Unsupervised learning; Python; C++
DOI
10.1007/978-3-031-48885-6_10
Chinese Library Classification (CLC)
C93 [Management Science]; O22 [Operations Research]
Discipline classification codes
070105; 12; 1201; 1202; 120202
Abstract
We introduce dSalmon, a highly efficient framework for outlier detection on streaming data. dSalmon can be used with both Python and C++, meeting the requirements of modern data science research. It provides an intuitive interface and has almost no package dependencies. dSalmon implements main stream outlier detection approaches from the literature. By using pure C++ in its core and making the most of available parallelism, it analyzes data with superior processing speed. We describe design decisions and outline the software architecture of dSalmon. Additionally, we perform thorough evaluations on benchmarking datasets to measure execution time, memory requirements and energy consumption when performing outlier detection. Experiments show that dSalmon requires substantially fewer resources and in most cases is able to process datasets between one and three orders of magnitude faster than established Python implementations.
Pages: 153-169
Page count: 17