An online ensembles approach for handling concept drift in data streams: diversified online ensembles detection

Authors
Parneeta Sidhu
M. P. S. Bhatia
Affiliation
[1] Netaji Subhas Institute of Technology, Division of CoE
Keywords
Concept drift; Ensemble; Diversity; Data stream; Online learning;
DOI
Not available
Abstract
Data streams are continuous sequences of data instances that arrive at very high speed and whose underlying concept distribution varies over time. We present a novel online ensemble approach, Diversified online ensembles detection (DOED), for handling these drifting concepts in data streams. Our approach maintains two ensembles of weighted experts, one with low diversity and one with high diversity, which are updated according to their accuracy in classifying new data instances. It detects drifts by comparing two accuracies: the accuracy of an ensemble on recent examples and its accuracy since the beginning of learning. The final prediction for an instance is the class predicted by the ensemble that is more accurate on the recent examples. When an ensemble detects a drift, it is reinitialized while maintaining its diversity level. Experimental evaluation on various artificial and real-world datasets shows that DOED provides very high accuracy in classifying new data instances, irrespective of the size of the dataset, the type of drift, or the presence of noise. We compare DOED with other learners in terms of new performance metrics such as the kappa statistic, model cost, evaluation time, and memory requirements. Our approach proved to be highly resource effective, achieving very high accuracies even in a resource-constrained environment.
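The drift test and ensemble selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `AccuracyTracker`, the sliding-window size, the fixed `threshold`, and the helper `pick_ensemble` are all assumptions made for illustration, since the abstract does not say how "recent examples" are delimited or how the two accuracies are compared.

```python
from collections import deque


class AccuracyTracker:
    """Tracks one ensemble's accuracy since the start of learning and over a
    sliding window of recent examples (window size is an assumption)."""

    def __init__(self, window=50):
        self.recent = deque(maxlen=window)  # outcomes on recent examples
        self.correct = 0                    # correct predictions since start
        self.seen = 0                       # examples seen since start

    def update(self, was_correct):
        """Record whether the ensemble classified the latest instance correctly."""
        self.recent.append(bool(was_correct))
        self.correct += int(bool(was_correct))
        self.seen += 1

    def overall_accuracy(self):
        return self.correct / self.seen if self.seen else 0.0

    def recent_accuracy(self):
        return sum(self.recent) / len(self.recent) if self.recent else 0.0

    def drift_detected(self, threshold=0.2):
        """Flag a drift when recent accuracy falls below overall accuracy by
        more than `threshold` (a hypothetical rule, not taken from the paper)."""
        if len(self.recent) < self.recent.maxlen:
            return False  # wait until the window is full
        return self.overall_accuracy() - self.recent_accuracy() > threshold

    def reset(self):
        """Clear the statistics, e.g. after the ensemble is reinitialized."""
        self.recent.clear()
        self.correct = 0
        self.seen = 0


def pick_ensemble(low_div, high_div):
    """Return which ensemble's prediction to use: the one that is more
    accurate on recent examples (ties go to the low-diversity ensemble)."""
    if low_div.recent_accuracy() >= high_div.recent_accuracy():
        return "low"
    return "high"
```

In use, each ensemble would own one tracker, call `update` after every labeled instance, and call `reset` when `drift_detected` fires and the ensemble is rebuilt.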
Pages: 883-909
Page count: 26