Learning model trees from evolving data streams

Cited by: 193
Authors
Ikonomovska, Elena [1, 4]
Gama, Joao [2, 3]
Dzeroski, Saso [1]
Affiliations
[1] Jozef Stefan Inst, Ljubljana 1000, Slovenia
[2] Univ Porto, LIAAD INESC, P-4050190 Oporto, Portugal
[3] Univ Porto, Fac Econ, P-4200 Oporto, Portugal
[4] Ss Cyril & Methodius Univ, Fac Elect Engn & Informat Technol, Skopje 1000, Macedonia
Keywords
Non-stationary data streams; Stream data mining; Regression trees; Model trees; Incremental algorithms; On-line learning; Concept drift; On-line change detection; Drift
DOI
10.1007/s10618-010-0201-y
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The real-time extraction of meaningful patterns from time-changing data streams is of increasing importance to the machine learning and data mining communities. Regression on time-changing data streams remains relatively unexplored, despite its apparent applications. This paper proposes an efficient, incremental stream mining algorithm that learns regression and model trees from possibly unbounded, high-speed, time-changing data streams. The algorithm is evaluated extensively in a variety of settings involving artificial and real data. To the best of our knowledge, there is no other general-purpose algorithm for incremental learning of regression/model trees that performs explicit change detection and informed adaptation. The algorithm operates online and in real time, observes each example only once at the speed of arrival, and maintains a ready-to-use model tree at any time. The tree leaves contain linear models induced online from the examples assigned to them, a process with low complexity. The algorithm has mechanisms for drift detection and model adaptation, which enable it to maintain accurate and up-to-date regression models at any time. The drift detection mechanism exploits the structure of the tree to detect change locally. In response to local drift, the algorithm updates the tree structure only locally. This approach improves the any-time performance and greatly reduces the cost of adaptation.
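The leaf-level behaviour described in the abstract (a linear model updated online, with a change detector that triggers a local reset) can be illustrated with a minimal sketch. The abstract does not specify the update rule or the detector, so the choices here are assumptions: a stochastic-gradient (delta rule) update for the linear model and a Page-Hinkley test on the absolute prediction error. The names `PageHinkley`, `LinearLeaf`, and `run_stream`, the parameter values, and the synthetic stream are all hypothetical, and a single leaf stands in for a subtree of the full tree.

```python
import random


class PageHinkley:
    """Page-Hinkley test for an increase in the mean of a signal (assumed detector)."""

    def __init__(self, delta=0.005, threshold=50.0):
        self.delta = delta          # tolerated magnitude of change
        self.threshold = threshold  # alarm threshold
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0              # cumulative deviation from the running mean
        self.cum_min = 0.0          # smallest cumulative deviation seen so far

    def update(self, x):
        """Feed one observation; return True if a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold


class LinearLeaf:
    """Linear model in a leaf, trained one example at a time (assumed delta rule)."""

    def __init__(self, n_features, lr=0.01):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        return self.b + sum(wi * xi for wi, xi in zip(self.w, x))

    def learn(self, x, y):
        err = y - self.predict(x)
        self.b += self.lr * err
        self.w = [wi + self.lr * err * xi for wi, xi in zip(self.w, x)]


def run_stream(n=4000, drift_at=2000, seed=0):
    """Process a synthetic stream whose target function changes at drift_at."""
    rng = random.Random(seed)
    leaf, detector = LinearLeaf(2), PageHinkley()
    alarms = []
    for t in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        if t < drift_at:
            y = 2 * x[0] + x[1]
        else:
            y = -3 * x[0] + 5 * x[1] + 4
        y += rng.gauss(0, 0.1)
        # Test first, then train: each example is seen exactly once.
        abs_err = abs(y - leaf.predict(x))
        if detector.update(abs_err):
            alarms.append(t)
            # Local adaptation: discard only this leaf's model and detector.
            leaf, detector = LinearLeaf(2), PageHinkley()
        else:
            leaf.learn(x, y)
    return alarms
```

Calling `run_stream()` returns the time steps at which the detector raised an alarm; on this synthetic stream the first alarm should appear shortly after the change point. Resetting only the affected leaf, rather than the whole model, mirrors the local adaptation the abstract describes, although the published algorithm's actual adaptation strategy may differ.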
Pages: 128-168
Number of pages: 41
Related papers
50 in total
  • [41] Regression Trees from Data Streams with Drift Detection
    Ikonomovska, Elena
    Gama, Joao
    Sebastiao, Raquel
    Gjorgjevik, Dejan
    DISCOVERY SCIENCE, PROCEEDINGS, 2009, 5808 : 121 - +
  • [42] Predicting Future Decision Trees from Evolving Data
    Boettcher, Mirko
    Spott, Martin
    Kruse, Rudolf
    ICDM 2008: EIGHTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2008, : 33 - +
  • [43] Random Forests of Very Fast Decision Trees on GPU for Mining Evolving Big Data Streams
    Marron, Diego
    Bifet, Albert
    Morales, Gianmarco De Francisci
    21ST EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE (ECAI 2014), 2014, 263 : 615 - +
  • [44] An Overview on Learning from Data Streams
    João Gama
    Pedro Rodrigues
    Jesús Aguilar-Ruiz
    New Generation Computing, 2006, 25 (1) : 1 - 4
  • [45] Active learning from data streams
    Zhu, Xingquan
    Zhang, Peng
    Lin, Xiaodong
    Shi, Yong
    ICDM 2007: PROCEEDINGS OF THE SEVENTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 2007, : 757 - +
  • [46] Online Evaluation of Patterns from Evolving Web Data Streams
    Rojas, Carlos
    Nasraoui, Olfa
    2009 IEEE/WIC/ACM INTERNATIONAL JOINT CONFERENCES ON WEB INTELLIGENCE (WI) AND INTELLIGENT AGENT TECHNOLOGIES (IAT), VOL 1, 2009, : 315 - 318
  • [47] Evolving granular neural networks from fuzzy data streams
    Leite, Daniel
    Costa, Pyramo
    Gomide, Fernando
    NEURAL NETWORKS, 2013, 38 : 1 - 16
  • [48] Online Semi-supervised Learning from Evolving Data Streams with Meta-features and Deep Reinforcement Learning
    Vafaie, Parsa
    Viktor, Herna
    Paquet, Eric
    Michalowski, Wojtek
    MACHINE LEARNING, OPTIMIZATION, AND DATA SCIENCE (LOD 2021), PT II, 2022, 13164 : 70 - 85
  • [49] Online Learning Model for Handling Different Concept Drifts Using Diverse Ensemble Classifiers on Evolving Data Streams
    Ancy, S.
    Paulraj, D.
    CYBERNETICS AND SYSTEMS, 2019, 50 (07) : 579 - 608
  • [50] On change diagnosis in evolving data streams
    Aggarwal, CC
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2005, 17 (05) : 587 - 600