Large-scale Neural Modeling in MapReduce and Giraph

Cited by: 0

Authors:

Yang, Shuo [1]

Spielman, Nicholas D. [2]

Jackson, Jadin C. [3]

Rubin, Brad S. [1]

Affiliations:

[1] St Thomas Univ, Grad Programs Software, St Paul, MN 55455 USA

[2] Neurosci Program Univ St Thomas, Minneapolis, MN USA

[3] Univ St Thomas, Dept Biol, Minneapolis, MN USA

Source:

2014 IEEE INTERNATIONAL CONFERENCE ON ELECTRO/INFORMATION TECHNOLOGY (EIT) | 2014

DOI:

Not available

Chinese Library Classification number:

TP39 [Computer applications]

Discipline classification code:

081203; 0835

Abstract:

One of the most crucial challenges in scientific computing is scalability. Hadoop, an open-source implementation of the MapReduce parallel programming model developed by Google, has emerged as a powerful platform for performing large-scale scientific computing at very low cost. In this paper, we explore the use of Hadoop to model large-scale neural networks. A neural network is most naturally modeled by a graph structure with iterative processing. We first present an improved graph algorithm design pattern in MapReduce called Mapper-side Schimmy. Experiments show that applying our design pattern, combined with current best practices, can reduce the running time of a neural network simulation on a network with 100,000 neurons and 2.3 billion edges by 64%. MapReduce, however, is inherently inefficient for iterative graph processing. To address this limitation, we then explore the use of Giraph, an open-source large-scale graph processing framework that sits on top of Hadoop, to implement graph algorithms with a vertex-centric approach. We show that our Giraph implementation boosted performance by 91% compared to a basic MapReduce implementation and by 60% compared to our improved Mapper-side Schimmy algorithm.
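To make the vertex-centric idea in the abstract concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation and does not use the Giraph API. The function name `run_supersteps`, the threshold-and-reset neuron rule, and the data layout are all assumptions chosen for illustration. It shows only the general pattern: each vertex reads the messages sent to it in the previous superstep, updates its own state, and sends messages along its outgoing edges, with a barrier between supersteps.

```python
from collections import defaultdict


def run_supersteps(edges, potentials, threshold=1.0, supersteps=3):
    """Toy synchronous vertex-centric simulation (illustrative assumption).

    edges:      dict mapping neuron id -> list of (target id, weight)
    potentials: dict mapping neuron id -> initial potential
    Returns (list of neurons that fired in each superstep, final potentials).
    """
    potentials = dict(potentials)  # work on a copy, leave the input untouched
    inbox = defaultdict(list)      # messages delivered at the start of a superstep
    fired_history = []

    for _ in range(supersteps):
        outbox = defaultdict(list)  # messages produced during this superstep
        fired = []
        for v in sorted(potentials):
            # Per-vertex compute step: a vertex sees only its own state
            # and the messages addressed to it.
            potentials[v] += sum(inbox.get(v, []))
            if potentials[v] >= threshold:
                fired.append(v)
                potentials[v] = 0.0  # hypothetical reset rule
                for target, weight in edges.get(v, []):
                    outbox[target].append(weight)
        # Barrier: messages become visible only in the next superstep.
        inbox = outbox
        fired_history.append(fired)

    return fired_history, potentials
```

In this sketch the graph structure stays in place across iterations and only messages move between vertices, which is the property that makes a vertex-centric framework a natural fit for iterative graph processing. A MapReduce formulation would instead run one job per iteration and pass the graph state from job to job.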


Pages: 556-561

Page count: 6
