Reducing data transfer in big-data workflows: the computation-flow delegated approach

Authors
Rickey T. P. Nunes
Santosh L. Deshpande
Affiliations
[1] Government Polytechnic, Department of Computer Engineering
[2] Visvesvaraya Technological University, Centre for Postgraduate Studies
Keywords
Big-data; Bioinformatics; Orchestration; Workflow; Mobile agents; Computation-flow
DOI
10.1007/s42488-019-00012-z
Abstract
Existing approaches to executing orchestrated bioinformatics workflows require datasets to be transferred from biological data services to the workflow's analysis tool (computation) services for analysis. Moving data to computation in this way degrades workflow performance, especially when the workflow must handle big data. Because the analysis tools in a workflow are much smaller than its datasets, this paper proposes a novel computation-flow delegated (CFD) approach that minimizes dataflow and improves workflow performance. The CFD approach lets the workflow's tool services dynamically migrate analysis tools toward the datasets, so that computation runs on the data side during workflow execution. We use a set of mobile agents to operate the CFD approach and present a mobile agent-based computation-flow delegation (MABCFD) framework to execute the workflow tasks. We implement a prototype of the MABCFD framework and analyze the performance of the CFD approach empirically by executing, in isolation, the workflow patterns (sequence, fan-out, and fan-in) common to bioinformatics applications. Performance analysis shows that the computation-driven CFD approach consistently outperforms the existing data-driven approaches across all patterns and scales favorably with data size.
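The contrast the abstract draws between moving data to computation and moving computation to data can be illustrated with a rough transfer-volume comparison for the sequence pattern. This is only a minimal sketch under simplifying assumptions and is not the authors' MABCFD framework or cost model: the `Task` type, both function names, and all byte sizes are hypothetical, every tool is assumed to migrate to a single data-side host where intermediate results stay, and delivery of the final result is ignored in both cases.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Task:
    """One step of a sequence workflow (hypothetical illustration type)."""
    tool_bytes: int    # size of the analysis tool that could be migrated
    output_bytes: int  # size of the result this step produces


def data_driven_transfer(dataset_bytes: int, tasks: Sequence[Task]) -> int:
    """Bytes moved when each step's input is shipped to the tool service."""
    moved = 0
    current_input = dataset_bytes
    for task in tasks:
        moved += current_input            # input travels to the tool's host
        current_input = task.output_bytes  # output feeds the next step
    return moved


def computation_driven_transfer(tasks: Sequence[Task]) -> int:
    """Bytes moved when each tool migrates to the data side instead.

    Assumes intermediate results stay on the data-side host, so only
    the tools themselves cross the network.
    """
    return sum(task.tool_bytes for task in tasks)


# Hypothetical sizes: a 10 GB dataset processed by two small tools.
workflow = [
    Task(tool_bytes=5_000_000, output_bytes=2_000_000_000),
    Task(tool_bytes=8_000_000, output_bytes=500_000_000),
]
data_side = data_driven_transfer(10_000_000_000, workflow)
tool_side = computation_driven_transfer(workflow)
print(f"data-driven: {data_side} bytes, computation-driven: {tool_side} bytes")
```

Under these assumed sizes the tools account for a small fraction of the bytes that the datasets would require, which is the intuition behind the approach. Real performance also depends on factors the sketch leaves out, such as agent migration overhead and the compute capacity available on the data side.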
Pages: 129-145
Page count: 16