IGTree: Using trees for compression and classification in lazy learning algorithms

Cited by: 65
Authors
Daelemans, W [1]
Van den Bosch, A [1]
Weijters, T [1]
Affiliations
[1] Maastricht Univ, MATRIKS, Maastricht, Netherlands
Keywords
lazy learning; eager learning; decision trees; information gain; data compression; instance base indexing;
DOI
10.1023/A:1006506017891
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We describe the IGTree learning algorithm, which compresses an instance base into a tree structure. The concept of information gain is used as a heuristic function for performing this compression. IGTree produces trees that, compared to other lazy learning approaches, reduce storage requirements and the time required to compute classifications. Furthermore, we obtained similar or better generalization accuracy with IGTree when trained on two complex linguistic tasks, viz. letter-phoneme transliteration and part-of-speech tagging, when compared to alternative lazy learning and decision tree approaches (viz., IB1, information-gain-weighted IB1, and C4.5). A third experiment, with the task of word hyphenation, demonstrates that when the mutual differences in information gain of features are too small, IGTree as well as information-gain-weighted IB1 perform worse than IB1. These results indicate that IGTree is a useful algorithm for problems characterized by the availability of a large number of training instances described by symbolic features with sufficiently differing information gain values.
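The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that features are ranked once by information gain over the whole instance base, that the tree is a trie expanded in that fixed order, and that each node stores a default (most frequent) class used when no matching arc exists. The function names (`entropy`, `information_gain`, `build`, `classify`) and the toy data are hypothetical, and the published algorithm may differ in details such as additional pruning.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(instances, labels, feature):
    """Entropy reduction obtained by knowing the value of one feature."""
    by_value = defaultdict(list)
    for x, y in zip(instances, labels):
        by_value[x[feature]].append(y)
    remainder = sum(len(ys) / len(labels) * entropy(ys)
                    for ys in by_value.values())
    return entropy(labels) - remainder


def build(instances, labels, order):
    """Build a trie over features in the given fixed order (assumed scheme)."""
    node = {"default": Counter(labels).most_common(1)[0][0], "children": {}}
    # Stop expanding when the node is unambiguous or no features remain.
    if len(set(labels)) == 1 or not order:
        return node
    groups = defaultdict(lambda: ([], []))
    for x, y in zip(instances, labels):
        xs, ys = groups[x[order[0]]]
        xs.append(x)
        ys.append(y)
    for value, (xs, ys) in groups.items():
        node["children"][value] = build(xs, ys, order[1:])
    return node


def classify(tree, instance, order):
    """Follow matching arcs; fall back to the last node's default class."""
    node = tree
    for feature in order:
        child = node["children"].get(instance[feature])
        if child is None:
            break
        node = child
    return node["default"]


# Hypothetical toy instance base: two symbolic features, two classes.
data = [("a", "x"), ("b", "x"), ("a", "y"), ("b", "y"), ("a", "z")]
labels = ["P", "P", "Q", "Q", "P"]

# Rank features once, from highest to lowest information gain.
order = sorted(range(2),
               key=lambda f: information_gain(data, labels, f),
               reverse=True)
tree = build(data, labels, order)
```

In this toy example the second feature determines the class on its own, so it is ranked first and the tree never needs to test the other feature. An unseen value at the root falls back to the majority class of the whole instance base, which is where the storage and lookup savings over a flat instance base would come from under these assumptions.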
Pages: 407 - 423
Page count: 17
Related papers
50 in total
  • [21] Petrofacies classification using machine learning algorithms
    Silva, Adrielle A.
    Tavares, Monica W.
    Carrasquilla, Abel
    Missagia, Roseane
    Ceia, Marco
    GEOPHYSICS, 2020, 85 (04) : WA101 - WA113
  • [22] Classification of Sleep States in Mice using Generic Compression Algorithms
    Mayer, Owen
    Lim, Diane C.
    Pack, Allan I.
    Stamm, Matthew C.
    PROCEEDINGS OF 2016 IEEE SIGNAL PROCESSING IN MEDICINE AND BIOLOGY SYMPOSIUM (SPMB), 2016,
  • [23] Structural Compression of Packet Classification Trees
    Wang, Xiang
    Liu, Zhi
    Qi, Yaxuan
    Li, Jun
    PROCEEDINGS OF THE EIGHTH ACM/IEEE SYMPOSIUM ON ARCHITECTURES FOR NETWORKING AND COMMUNICATIONS SYSTEMS (ANCS'12), 2012, : 83 - 84
  • [24] Domain Classification using B+ Trees in Fractal Image Compression
    Jayamohan, M.
    Revathy, K.
    2012 NATIONAL CONFERENCE ON COMPUTING AND COMMUNICATION SYSTEMS (NCCCS), 2012, : 187 - 191
  • [25] Evolutionary algorithms for classification and regression trees
    Mola, Francesco
    Miele, Raffaele
    DATA ANALYSIS, CLASSIFICATION AND THE FORWARD SEARCH, 2006, : 255 - +
  • [26] Classification of poplar trees with object-based ensemble learning algorithms using Sentinel-2A imagery
    Tonbul, H.
    Colkesen, I
    Kavzoglu, T.
    JOURNAL OF GEODETIC SCIENCE, 2020, 10 (01) : 14 - 22
  • [27] Reducts Evaluation Methods Using Lazy Algorithms
    Delimata, Pawel
    Suraj, Zbigniew
    ROUGH SETS AND KNOWLEDGE TECHNOLOGY, PROCEEDINGS, 2009, 5589 : 120 - 127
  • [28] Skin lesion classification using decision trees and random forest algorithms
    Dhivyaa, C. R.
    Sangeetha, K.
    Balamurugan, M.
    Amaran, Sibi
    Vetriselvi, T.
    Johnpaul, P.
    JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING, 2020, 15 (Suppl 1) : 157 - 157
  • [29] PAC-Bayesian compression bounds on the prediction error of learning algorithms for classification
    Graepel, T
    Herbrich, R
    MACHINE LEARNING, 2005, 59 (1-2) : 55 - 76
  • [30] PAC-Bayesian Compression Bounds on the Prediction Error of Learning Algorithms for Classification
    Thore Graepel
    Ralf Herbrich
    John Shawe-Taylor
    Machine Learning, 2005, 59 : 55 - 76