Dense Neighborhoods on Affinity Graph

Cited by: 32
Authors
Liu, Hairong [1]
Yang, Xingwei [2]
Latecki, Longin Jan [2]
Yan, Shuicheng [1]
Affiliations
[1] Natl Univ Singapore, Singapore 117548, Singapore
[2] Temple Univ, Philadelphia, PA 19122 USA
Funding
National Science Foundation (USA)
Keywords
Nearest neighbor; Affinity graph; Semi-supervised learning; Clustering;
DOI
10.1007/s11263-011-0496-1
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we study the problem of how to reliably compute neighborhoods on affinity graphs. The k-nearest neighbors (kNN) is one of the most fundamental and simple methods widely used in many tasks, such as classification and graph construction. Previous research focused on how to efficiently compute kNN on vectorial data. However, most real-world data have no vectorial representations, and only have affinity graphs which may contain unreliable affinities. Since the kNN of an object o is a set of k objects with the highest affinities to o, it is easily disturbed by errors in pairwise affinities between o and other objects, and also it cannot well preserve the structure underlying the data. To reliably analyze the neighborhood on affinity graphs, we define the k-dense neighborhood (kDN), which considers all pairwise affinities within the neighborhood, i.e., not only the affinities between o and its neighbors but also between the neighbors. For an object o, its kDN is a set kDN(o) of k objects which maximizes the sum of all pairwise affinities of objects in the set {o} ∪ kDN(o). We analyze the properties of kDN, and propose an efficient algorithm to compute it. Both theoretic analysis and experimental results on shape retrieval, semi-supervised learning, point set matching and data clustering show that kDN significantly outperforms kNN on affinity graphs, especially when many pairwise affinities are unreliable.
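The kDN definition in the abstract can be illustrated with a small sketch. The code below is not the efficient algorithm the paper proposes; it is an exhaustive search over all size-k subsets, practical only for tiny graphs, and the function names (`knn`, `kdn_brute_force`) and the toy affinity matrix `A` are assumptions made up for illustration. Object 3 is given one inflated affinity to object 0 to mimic an unreliable edge.

```python
from itertools import combinations

import numpy as np


def knn(affinity, o, k):
    """Indices of the k objects with the highest affinity to object o."""
    others = [i for i in range(len(affinity)) if i != o]
    return sorted(sorted(others, key=lambda i: -affinity[o, i])[:k])


def kdn_brute_force(affinity, o, k):
    """Exhaustive search (illustrative only) for the k objects S that
    maximize the sum of all pairwise affinities within {o} united with S."""
    others = [i for i in range(len(affinity)) if i != o]
    best_set, best_score = None, -np.inf
    for subset in combinations(others, k):
        members = (o,) + subset
        score = sum(affinity[i, j] for i, j in combinations(members, 2))
        if score > best_score:
            best_set, best_score = subset, score
    return sorted(best_set)


# Hypothetical symmetric affinity matrix: objects 0, 1, 2 are mutually
# similar; the 0.9 between objects 0 and 3 plays the role of a noisy edge.
A = np.array([
    [0.0, 0.8, 0.7, 0.9, 0.1],
    [0.8, 0.0, 0.9, 0.1, 0.1],
    [0.7, 0.9, 0.0, 0.1, 0.1],
    [0.9, 0.1, 0.1, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.0],
])

print(knn(A, 0, 2))              # [1, 3]: the noisy edge pulls in object 3
print(kdn_brute_force(A, 0, 2))  # [1, 2]: mutual affinities favor the dense group
```

In this toy case kNN looks only at row 0 and so admits object 3, while the kDN objective also counts the affinity between the neighbors themselves, which object 3 lacks.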
Pages: 65-82
Page count: 18