IGTree: Using trees for compression and classification in lazy learning algorithms

Cited by: 65
Authors
Daelemans, W [1]
van den Bosch, A [1]
Weijters, T [1]
Affiliation
[1] Maastricht Univ, MATRIKS, Maastricht, Netherlands
Keywords
lazy learning; eager learning; decision trees; information gain; data compression; instance base indexing
DOI
10.1023/A:1006506017891
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We describe the IGTree learning algorithm, which compresses an instance base into a tree structure. Information gain is used as the heuristic function that guides this compression. Compared to other lazy learning approaches, the trees IGTree produces reduce both storage requirements and the time required to compute classifications. Furthermore, on two complex linguistic tasks, viz. letter-phoneme transliteration and part-of-speech tagging, IGTree achieved similar or better generalization accuracy than alternative lazy learning and decision tree approaches (viz. IB1, information-gain-weighted IB1, and C4.5). A third experiment, on the task of word hyphenation, demonstrates that when the mutual differences in information gain between features are too small, both IGTree and information-gain-weighted IB1 perform worse than IB1. These results indicate that IGTree is a useful algorithm for problems characterized by the availability of a large number of training instances described by symbolic features with sufficiently differing information gain values.
Pages: 407-423
Number of pages: 17
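
Illustrative sketch (not part of the original record)
The abstract describes compressing an instance base into a tree with information gain as the heuristic. The Python sketch below shows one way that idea can be realized; it is not the authors' implementation. It rests on assumptions the abstract does not spell out: features are ranked once by information gain over the whole instance base, each tree level tests the next feature in that fixed order, every node stores the majority class as a default, and classification falls back to the default of the last node that matched. All function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(instances, labels, f):
    """Entropy reduction obtained by knowing the value of feature index f."""
    total = len(labels)
    by_value = {}
    for x, y in zip(instances, labels):
        by_value.setdefault(x[f], []).append(y)
    remainder = sum(len(ys) / total * entropy(ys) for ys in by_value.values())
    return entropy(labels) - remainder


def build(instances, labels, order):
    """Build a tree that tests features in the fixed order given (assumption)."""
    node = {"default": Counter(labels).most_common(1)[0][0], "children": {}}
    # Stop expanding once the node is unambiguous or no features remain.
    if len(set(labels)) == 1 or not order:
        return node
    f, rest = order[0], order[1:]
    node["feature"] = f
    groups = {}
    for x, y in zip(instances, labels):
        xs, ys = groups.setdefault(x[f], ([], []))
        xs.append(x)
        ys.append(y)
    for value, (xs, ys) in groups.items():
        node["children"][value] = build(xs, ys, rest)
    return node


def classify(tree, x):
    """Follow matching arcs; return the default class of the last node reached."""
    node = tree
    while node["children"]:
        child = node["children"].get(x[node["feature"]])
        if child is None:
            break
        node = child
    return node["default"]


# Hypothetical toy instance base: feature 1 determines the class, feature 0 is noise.
instances = [("a", "x"), ("b", "x"), ("a", "y"), ("b", "y"), ("a", "x")]
labels = ["P", "P", "Q", "Q", "P"]

# Rank features once, from highest to lowest information gain.
order = sorted(range(2), key=lambda f: information_gain(instances, labels, f), reverse=True)
tree = build(instances, labels, order)
```

In this toy case the informative feature is tested first and its subtrees are already unambiguous, so the noise feature is never stored. That illustrates the compression effect the abstract refers to, and also why the approach depends on features having clearly different information gain values.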