On the role of message broker middleware for many-task computing on a big-data platform

Cited by: 4
Authors
Cao Ngoc Nguyen
Jaehwan Lee
Soonwook Hwang
Jik-Soo Kim
Affiliations
[1] University of Science & Technology, Korea Institute of Science and Technology Information
[2] Korea Aerospace University, School of Electronics and Information Engineering
[3] Myongji University, Department of Computer Engineering
Source
Cluster Computing | 2019 / Volume 22
Keywords
Many-task computing; Message broker middleware; Hadoop; YARN; ActiveMQ; Kafka; MOHA; Load balancing
DOI
Not available
Abstract
We have designed and implemented a new data processing framework called “Many-task computing On HAdoop” (MOHA), which aims to effectively support fine-grained many-task applications, another type of data-intensive workload, on the YARN-based Hadoop 2.0 platform. MOHA is developed as a Hadoop YARN application so that it can transparently co-host existing many-task computing (MTC) applications with other data processing workflows, such as MapReduce, in a single Hadoop cluster. In this paper, we investigate the main characteristics of two well-known open-source message broker middleware systems (Apache ActiveMQ and Kafka) and their implications for the many-task management scheme in our MOHA framework. Through extensive experiments with a real MTC application, we demonstrate and discuss the trade-offs between parallelism and load balancing of data access patterns in message broker middleware systems for many-task computing on Hadoop.
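The trade-off named in the abstract can be illustrated with a toy model. The sketch below is not MOHA's implementation and uses no broker API; it only compares two task-dispatch patterns under stated assumptions: a single shared queue that every worker pulls from (the competing-consumer pattern of an ActiveMQ queue) and statically partitioned queues with one consumer each (the pattern of Kafka partitions within a consumer group). The function names, the round-robin partitioning, and the task durations are hypothetical, and broker-side contention, which is what partitioning is meant to relieve, is deliberately not modeled.

```python
import heapq


def shared_queue_makespan(durations, num_workers):
    """Finish time when all workers pull from one shared queue.

    Each task, in queue order, goes to whichever worker becomes free first.
    """
    free_at = [0.0] * num_workers  # time at which each worker is next idle
    heapq.heapify(free_at)
    for d in durations:
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + d)
    return max(free_at)


def partitioned_makespan(durations, num_workers):
    """Finish time when tasks are pre-assigned round-robin to per-worker queues.

    Each worker consumes only its own partition, so no rebalancing happens.
    """
    loads = [0.0] * num_workers
    for i, d in enumerate(durations):
        loads[i % num_workers] += d
    return max(loads)


if __name__ == "__main__":
    # Hypothetical skewed workload: two long tasks among many short ones.
    tasks = [8, 1, 1, 1, 8, 1, 1, 1]
    print("shared queue:", shared_queue_makespan(tasks, 2))
    print("partitioned: ", partitioned_makespan(tasks, 2))
```

With this skewed workload and two workers, the shared queue finishes at time 11 while the round-robin partitions finish at time 18, because both long tasks land in the same partition. The model shows only the load-balancing side of the trade-off; any throughput gained from parallel access to independent partitions would have to be measured on a real broker.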
Pages: 2527-2540
Page count: 13