Solutions for Processing K Nearest Neighbor Joins for Massive Data on MapReduce

Cited by: 16
Authors
Song, Ge [1, 2]
Rochas, Justine [1]
Huet, Fabrice [1]
Magoules, Frederic [2]
Affiliations
[1] Univ Nice Sophia Antipolis, CNRS, I3S, UMR 7271, F-06900 Sophia Antipolis, France
[2] Ecole Cent Paris, Chatenay Malabry, France
Source
23RD EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, AND NETWORK-BASED PROCESSING (PDP 2015) | 2015
Keywords
kNN Join; Data Partition; Hadoop; MapReduce; SEARCH;
DOI
10.1109/PDP.2015.79
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Given a point p and a set of points S, the kNN operation finds the k closest points to p in S. It is a computationally intensive task with a wide range of applications, such as knowledge discovery and data mining. However, as the volume and the dimension of data increase, only distributed approaches can perform such a costly operation in a reasonable time. Recent works have focused on implementing efficient solutions using the MapReduce programming model, because it is suitable for large-scale data processing and can easily be executed in a distributed environment. Although these works provide different solutions to the same problem, each one has particular constraints and properties, and there is no readily available comparison to help users choose the one most appropriate for their needs. This is the problem we address in this work. First, we show that all kNN implementations go through a common workflow, which we use as a basis for classification. Second, we describe precisely the different techniques published so far. Finally, we provide a set of objective criteria that can be used to make informed decisions.
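The definition at the start of the abstract can be illustrated with a minimal single-machine sketch. It assumes Euclidean distance and a brute-force scan, and the function names `knn` and `knn_join` are illustrative only; it is not one of the MapReduce techniques the paper surveys. The join variant named in the title simply applies the same operation to every point of a second set R.

```python
import heapq
import math


def knn(p, S, k):
    """Return the k points of S closest to p (Euclidean distance assumed)."""
    return heapq.nsmallest(k, S, key=lambda s: math.dist(p, s))


def knn_join(R, S, k):
    """For every point r in R, collect its k nearest neighbors in S."""
    return {r: knn(r, S, k) for r in R}


# Small hypothetical data set, used only to exercise the functions.
S = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (2.0, 2.0)]
R = [(0.9, 0.9), (4.0, 4.0)]
result = knn_join(R, S, 2)
```

This brute-force version compares every point of R with every point of S, which is the cost that motivates the distributed approaches discussed in the abstract.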
Pages: 279 - 287
Page count: 9
Related papers
50 in total
  • [1] Efficient Processing of k Nearest Neighbor Joins using MapReduce
    Lu, Wei
    Shen, Yanyan
    Chen, Su
    Ooi, Beng Chin
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2012, 5 (10) : 1016 - 1027
  • [2] Parallel Computation of k-Nearest Neighbor Joins Using MapReduce
    Kim, Wooyeol
    Kim, Younghoon
    Shim, Kyuseok
    2016 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2016 : 696 - 705
  • [4] On reverse-k-nearest-neighbor joins
    Emrich, Tobias
    Kriegel, Hans-Peter
    Kroeger, Peer
    Niedermayer, Johannes
    Renz, Matthias
    Zuefle, Andreas
    GEOINFORMATICA, 2015, 19 (02) : 299 - 330
  • [5] K Nearest Neighbour Joins for Big Data on MapReduce: A Theoretical and Experimental Analysis
    Song, Ge
    Rochas, Justine
    El Beze, Lea
    Huet, Fabrice
    Magoules, Frederic
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2016, 28 (09) : 2376 - 2392
  • [7] Map Reduce by K-Nearest Neighbor Joins
    Bethu, Srikanth
    Babu, B. Sankara
    Rao, S. Govinda
    Florence, R. Aruna
    2018 INTERNATIONAL CONFERENCE ON CYBER-ENABLED DISTRIBUTED COMPUTING AND KNOWLEDGE DISCOVERY (CYBERC 2018), 2018 : 222 - 231
  • [8] Efficient processing of all-k-nearest-neighbor queries in the MapReduce programming framework
    Moutafis, Panagiotis
    Mavrommatis, George
    Vassilakopoulos, Michael
    Sioutas, Spyros
    DATA & KNOWLEDGE ENGINEERING, 2019, 121 : 42 - 70
  • [9] A MapReduce-based k-Nearest Neighbor Approach for Big Data Classification
    Maillo, Jesus
    Triguero, Isaac
    Herrera, Francisco
    2015 IEEE TRUSTCOM/BIGDATASE/ISPA, VOL 2, 2015 : 167 - 172
  • [10] The k-Nearest Neighbor Algorithm Using MapReduce Paradigm
    Anchalia, Prajesh P.
    Roy, Kaushik
    PROCEEDINGS FIFTH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS, MODELLING AND SIMULATION, 2014 : 513 - 518