Pruning fuzzy ARTMAP using the minimum description length principle in learning from clinical databases

Cited by: 3
Authors
Lin, TH
Soo, VW
Source
NINTH IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, PROCEEDINGS | 1997
DOI
10.1109/TAI.1997.632281
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Fuzzy ARTMAP is one of a family of neural network architectures based on ART (Adaptive Resonance Theory) in which supervised learning can be carried out. However, it usually tends to create more categories than are actually needed. This often causes the so-called overfitting problem: for fuzzy ARTMAP, the performance of the network on the test set does not increase monotonically with additional training epochs and category creation. In order to avoid the overfitting problem, Carpenter and Tan [Carpenter and Tan, 1993] proposed a confidence-based pruning method that eliminates those categories that are either less useful or less accurate. This paper proposes an alternative pruning method based on the Minimum Description Length (MDL) principle. The MDL principle can be viewed as a tradeoff between theory complexity and data prediction accuracy given the theory. We adopted Cameron-Jones' error encoding scheme and Quinlan's modifier for theory encoding to estimate the fuzzy ARTMAP theory description length. A greedy search algorithm that prunes the fuzzy ARTMAP categories one by one so as to minimize the description length is proposed. The experiments showed that fuzzy ARTMAP pruned with the MDL principle gave better performance with far fewer categories than the original fuzzy ARTMAP and other machine learning systems on a number of benchmark clinical databases, such as the heart disease, breast cancer and diabetes databases.
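The greedy search described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the description length uses a simplified stand-in (a fixed, hypothetical `bits_per_category` cost for the theory plus a count-and-choose cost for the misclassified cases) rather than Cameron-Jones' error encoding or Quinlan's modifier, and the nearest-prototype `predict` function is a toy placeholder for fuzzy ARTMAP category matching. All names are illustrative.

```python
from math import comb, log2


def predict(categories, x):
    """Toy stand-in for category matching: label of the nearest prototype."""
    center, label = min(categories, key=lambda c: abs(c[0] - x))
    return label


def description_length(categories, data, bits_per_category):
    """Simplified MDL score: theory bits plus bits to encode the exceptions."""
    errors = sum(1 for x, y in data if predict(categories, x) != y)
    theory_bits = bits_per_category * len(categories)
    # Bits to state how many cases are wrong, then which ones they are.
    exception_bits = log2(len(data) + 1) + log2(comb(len(data), errors))
    return theory_bits + exception_bits


def greedy_mdl_prune(categories, data, bits_per_category=8.0):
    """Remove one category at a time while the description length decreases."""
    current = list(categories)
    best = description_length(current, data, bits_per_category)
    while len(current) > 1:
        # Score every set obtained by deleting exactly one category.
        scored = [
            (description_length(current[:i] + current[i + 1:], data,
                                bits_per_category),
             current[:i] + current[i + 1:])
            for i in range(len(current))
        ]
        dl, candidate = min(scored, key=lambda t: t[0])
        if dl >= best:
            break  # no single deletion shortens the description
        best, current = dl, candidate
    return current, best


# Hypothetical 1-D data: two well-separated classes, six redundant categories.
data = [(i / 100, "a") for i in range(20)] + [(1 + i / 100, "b") for i in range(20)]
categories = [(0.05, "a"), (0.10, "a"), (0.15, "a"),
              (1.05, "b"), (1.10, "b"), (1.15, "b")]

pruned, bits = greedy_mdl_prune(categories, data)
print(len(categories), "->", len(pruned), "categories")
```

In this sketch the search stops as soon as deleting any remaining category would cost more in misclassified cases than it saves in theory bits, which is the tradeoff the MDL principle expresses.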
Pages: 396-403
Page count: 8