Iterative optimization and simplification of hierarchical clusterings

Cited by: 95
Authors
Fisher, D
Affiliation
[1] Department of Computer Science, Box 1679, Station B, Vanderbilt University, Nashville
Keywords
DECISION;
DOI
10.1613/jair.276
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Clustering is often used for discovering structure in data. Clustering systems differ in the objective function used to evaluate clustering quality and the control strategy used to search the space of clusterings. Ideally, the search strategy should consistently construct clusterings of high quality, but be computationally inexpensive as well. In general, we cannot have it both ways, but we can partition the search so that a system inexpensively constructs a 'tentative' clustering for initial examination, followed by iterative optimization, which continues to search in background for improved clusterings. Given this motivation, we evaluate an inexpensive strategy for creating initial clusterings, coupled with several control strategies for iterative optimization, each of which repeatedly modifies an initial clustering in search of a better one. One of these methods appears novel as an iterative optimization strategy in clustering contexts. Once a clustering has been constructed it is judged by analysts - often according to task-specific criteria. Several authors have abstracted these criteria and posited a generic performance task akin to pattern completion, where the error rate over completed patterns is used to 'externally' judge clustering utility. Given this performance task, we adapt resampling-based pruning strategies used by supervised learning systems to the task of simplifying hierarchical clusterings, thus promising to ease post-clustering analysis. Finally, we propose a number of objective functions, based on attribute-selection measures for decision-tree induction, that might perform well on the error rate and simplicity dimensions.
Pages: 147-179
Page count: 33