Beam search induction and similarity constraints for predictive clustering trees

Cited by: 0
Authors
Kocev, Dragi [1 ]
Struyf, Jan [2 ]
Dzeroski, Saso [1 ]
Affiliations
[1] Jozef Stefan Inst, Dept Knowledge Technol, Jamova 39, Ljubljana 1000, Slovenia
[2] Katholieke Univ Leuven, Dept Comp Sci, B-3001 Heverlee, Belgium
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Much research on inductive databases (IDBs) focuses on local models, such as item sets and association rules. In this work, we investigate how IDBs can support global models, such as decision trees. Our focus is on predictive clustering trees (PCTs). PCTs generalize decision trees and can be used for prediction and clustering, two of the most common data mining tasks. Regular PCT induction builds PCTs top-down, using a greedy algorithm, similar to that of C4.5. We propose a new induction algorithm for PCTs based on beam search. This has three advantages over the regular method: (a) it returns a set of PCTs satisfying the user constraints instead of just one PCT; (b) it better allows for pushing of user constraints into the induction algorithm; and (c) it is less susceptible to myopia. In addition, we propose similarity constraints for PCTs, which improve the diversity of the resulting PCT set.
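The beam-search idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm or code: the tree encoding, the function names (`sse`, `leaves`, `score`, `refinements`, `beam_search`), the variance-based heuristic with a size penalty, and the `max_leaves` size constraint are all assumptions chosen for illustration, and the similarity constraints mentioned in the abstract are left out. The sketch keeps a beam of the best trees found so far, refines each one by replacing a single leaf with a split, and stops when the beam no longer changes.

```python
def sse(rows, idx):
    """Sum of squared errors of the targets in idx around their mean."""
    if not idx:
        return 0.0
    ys = [rows[i][1] for i in idx]
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def leaves(tree):
    """Yield every leaf of a tree encoded as nested tuples."""
    if tree[0] == "leaf":
        yield tree
    else:
        yield from leaves(tree[3])
        yield from leaves(tree[4])


def score(rows, tree, size_penalty):
    """Assumed heuristic: total leaf SSE plus a penalty per leaf (lower is better)."""
    found = list(leaves(tree))
    return sum(sse(rows, leaf[1]) for leaf in found) + size_penalty * len(found)


def refinements(rows, tree):
    """Yield every tree obtained by replacing exactly one leaf with one split."""
    if tree[0] == "leaf":
        idx = tree[1]
        for f in range(len(rows[0][0])):
            # Candidate thresholds: all observed values except the largest.
            for t in sorted({rows[i][0][f] for i in idx})[:-1]:
                left = tuple(i for i in idx if rows[i][0][f] <= t)
                right = tuple(i for i in idx if rows[i][0][f] > t)
                yield ("node", f, t, ("leaf", left), ("leaf", right))
    else:
        _, f, t, left, right = tree
        for new_left in refinements(rows, left):
            yield ("node", f, t, new_left, right)
        for new_right in refinements(rows, right):
            yield ("node", f, t, left, new_right)


def beam_search(rows, beam_width=3, max_leaves=4, size_penalty=0.1):
    """Return the beam_width best trees found, best first."""
    beam = [("leaf", tuple(range(len(rows))))]

    def key(tree):
        # repr() breaks ties deterministically so the loop is guaranteed to converge.
        return (score(rows, tree, size_penalty), repr(tree))

    while True:
        candidates = set(beam)
        for tree in beam:
            for new_tree in refinements(rows, tree):
                # Size constraint pushed into the search: oversized trees are dropped.
                if sum(1 for _ in leaves(new_tree)) <= max_leaves:
                    candidates.add(new_tree)
        new_beam = sorted(candidates, key=key)[:beam_width]
        if new_beam == beam:
            return beam
        beam = new_beam


if __name__ == "__main__":
    # Toy data: one numeric feature, one numeric target with a step between 1.0 and 2.0.
    data = [((0.0,), 1.0), ((1.0,), 1.0), ((2.0,), 5.0), ((3.0,), 5.0)]
    for candidate in beam_search(data):
        print(round(score(data, candidate, 0.1), 3), candidate)
```

Unlike a greedy top-down learner, the search returns several candidate trees rather than one, and the size check shows one simple way a user constraint can be applied during the search instead of afterwards. A diversity term, as the abstract's similarity constraints suggest, would presumably enter through the heuristic; that part is not modeled here.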
Pages: 134 / +
Number of pages: 2