Efficient Distributed k-Clique Mining for Large Networks Using MapReduce

Cited by: 0
Authors
Shahrivari, Saeed [1]
Jalili, Saeed [1]
Affiliations
[1] Tarbiat Modares Univ, Tehran 14115111, Iran
Keywords
Cloud computing; Data mining; Memory management; Task analysis; Social networking (online); Bioinformatics; Multicore processing; k-clique mining; MapReduce algorithms; parallel graph; community structure
DOI
10.1109/TKDE.2019.2936027
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Mining the cliques of a network is an important problem with many applications in fields such as social networks, bioinformatics, and web analysis. In most applications, mining fixed-size cliques, known as k-cliques, is enough. However, mining the cliques of a large network is very challenging with current solutions and takes considerable time on a commodity machine. Also, very large networks cannot be efficiently loaded into the memory of a single machine. To overcome these limitations, we propose a solution named KCminer, which is based on state space search and fits entirely into the MapReduce framework. Using the MapReduce framework, it is possible to run KCminer on cloud computing platforms and hence process very large networks in feasible time. Our experiments, which were performed on a cloud computing platform with 100 machines, show that KCminer is both fast and scalable. Besides the MapReduce framework, KCminer also executes efficiently on parallel shared-memory systems. Experiments on a commodity multicore desktop show that KCminer can effectively use the power of all cores. The experimental results also show that, even using a single thread, KCminer is much faster than available serial tools such as MACE.
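Illustrative sketch (not part of the original record): the abstract describes k-clique mining as a state space search that can be split into independent MapReduce tasks. The Python below is a minimal sketch of that general idea, assuming a state is a pair (partial clique, candidate vertices) and that each vertex seeds one map-style task. It is not KCminer's actual algorithm or code; the names build_adjacency, expand, map_vertex, and k_cliques are hypothetical, and everything runs in a single process rather than on a MapReduce cluster.

```python
from itertools import chain


def build_adjacency(edges):
    """Build an undirected adjacency map, ignoring self-loops."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def expand(clique, candidates, adj, k):
    """Depth-first search over states (partial clique, candidates).

    Invariant: every candidate is adjacent to every vertex in `clique`
    and is larger than all of them, so each k-clique is produced once.
    """
    if len(clique) == k:
        yield tuple(clique)
        return
    for v in sorted(candidates):
        next_candidates = {w for w in candidates if w > v and w in adj[v]}
        # Prune states that cannot possibly grow to size k.
        if len(clique) + 1 + len(next_candidates) >= k:
            yield from expand(clique + [v], next_candidates, adj, k)


def map_vertex(v, adj, k):
    """Map-style task (assumed split): all k-cliques whose smallest vertex is v."""
    return list(expand([v], {w for w in adj[v] if w > v}, adj, k))


def k_cliques(edges, k):
    """Run one independent task per vertex and merge the results."""
    adj = build_adjacency(edges)
    tasks = (map_vertex(v, adj, k) for v in sorted(adj))
    return sorted(chain.from_iterable(tasks))


if __name__ == "__main__":
    sample_edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
    print(k_cliques(sample_edges, 3))
```

Because each call to map_vertex depends only on the adjacency data and its seed vertex, the tasks are independent. That independence is what would let such a search be distributed across mappers or threads; how KCminer actually partitions and balances the work is not stated in the abstract.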
Pages: 964 - 974
Page count: 11
Related papers
50 in total
  • [31] An Architecture of Distributed Beta Wavelet Networks for large image classification in MapReduce
    Sakkari, Mohamed
    Zaied, Mourad
    2015 15TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS (ISDA), 2015, : 703 - 707
  • [32] HDFS Framework for Efficient Frequent Itemset Mining Using MapReduce
    Kulkarni, Prajakta G.
    Khonde, Shraddha R.
    2017 1ST INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND INFORMATION MANAGEMENT (ICISIM), 2017, : 171 - 178
  • [33] Parallel Distributed Trajectory Pattern Mining Using Hierarchical Grid with MapReduce
    Seki, Kazuhiro
    Jinno, Ryota
    Uehara, Kuniaki
    INTERNATIONAL JOURNAL OF GRID AND HIGH PERFORMANCE COMPUTING, 2013, 5 (04) : 79 - 96
  • [34] Finding a maximum k-club using the k-clique formulation and canonical hypercube cuts (vol 12, pg 1947, 2018)
    Lu, Yajun
    Moradi, Esmaeel
    Balasundaram, Balabhaskar
    OPTIMIZATION LETTERS, 2018, 12 (08) : 1959 - 1969
  • [35] An efficient and effective approach for mining a group stock portfolio using mapreduce
    Chen, Chun-Hao
    Chen, Chao-Chun
    Nojima, Yusuke
    INTELLIGENT DATA ANALYSIS, 2017, 21 : S217 - S232
  • [36] Multiple Object Tracking in Sensor Networks using Distributed Clique Finding
    Javed, Nauman
    Wolf, Tilman
    2013 INTERNATIONAL CONFERENCE ON COMPUTING, NETWORKING AND COMMUNICATIONS (ICNC), 2013,
  • [37] A Distributed Look-up Architecture for Text Mining Applications using MapReduce
    Balkir, Atilla Soner
    Foster, Ian
    Rzhetsky, Andrey
    HPDC 11: PROCEEDINGS OF THE 20TH INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE DISTRIBUTED COMPUTING, 2011, : 279 - 280
  • [38] Large-Scale Multimedia Data Mining Using MapReduce Framework
    Wang, Hanli
    Shen, Yun
    Wang, Lei
    Zhufeng, Kuangtian
    Wang, Wei
    Cheng, Cheng
    2012 IEEE 4TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING TECHNOLOGY AND SCIENCE (CLOUDCOM), 2012,
  • [39] Multi-dimensional geospatial data mining in a distributed environment using MapReduce
    Alkathiri, Mazin
    Jhummarwala, Abdul
    Potdar, M. B.
    JOURNAL OF BIG DATA, 2019, 6 (01)
  • [40] Multi-dimensional geospatial data mining in a distributed environment using MapReduce
    Mazin Alkathiri
    Abdul Jhummarwala
    M. B. Potdar
    Journal of Big Data, 6