A heterogeneous computing system for data mining workflows

Authors
Luo, Ping
Lu, Kevin
He, Qing
Shi, Zhongzhi
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100080, Peoples R China
[2] Brunel Univ, Uxbridge UB8 3PH, Middx, England
[3] Chinese Acad Sci, Grad Sch, Beijing, Peoples R China
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The computing-intensive Data Mining (DM) process calls for the support of a Heterogeneous Computing (HC) system, which consists of multiple computers with different configurations connected by a high-speed LAN, for increased computational power and resources. The DM process can be described as a multi-phase pipeline, and each phase may offer many optional methods. This makes the DM workflow very complex, so that it can be modelled only by a Directed Acyclic Graph (DAG). An HC system needs an effective and efficient scheduling framework that orchestrates all the computing hardware to perform multiple competing DM workflows. Motivated by the need for a practical solution to the scheduling problem for DM workflows, this paper proposes a dynamic DAG scheduling algorithm based on the characteristics of the execution time estimation model for DM jobs. Based on an approximate estimate of job execution time, the algorithm first maps DM jobs to machines in a decentralized and diligent (defined in this paper) manner. The performance of this initial mapping can then be improved through job migrations when necessary. The scheduling heuristic considers both the minimal completion time criterion and the critical path in the DAG. We implement this system in an established Multi-Agent System (MAS) environment, in which existing DM algorithms are reused by encapsulating them into agents. Practical classification problems are used to test and measure system performance. The detailed experimental procedure and result analysis are also discussed in this paper.
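As a rough illustration of the two factors the abstract names (critical path and minimal completion time), the following Python sketch shows a generic static list-scheduling heuristic. It is not the paper's algorithm: the abstract does not give its details, and this sketch is centralized and omits job migration. The function names, the job names, the cost estimates, and the machine speeds are all hypothetical.

```python
from collections import defaultdict


def critical_path_rank(dag, cost):
    """Longest remaining path (in estimated work) from each job to an exit job."""
    rank = {}

    def visit(job):
        if job not in rank:
            tail = max((visit(s) for s in dag.get(job, [])), default=0.0)
            rank[job] = cost[job] + tail
        return rank[job]

    for job in cost:
        visit(job)
    return rank


def schedule(dag, cost, speed):
    """Map jobs to machines.

    dag:   job -> list of successor jobs (assumed acyclic)
    cost:  job -> estimated work units (assumed positive)
    speed: machine -> relative speed (hypothetical heterogeneity model)
    """
    rank = critical_path_rank(dag, cost)
    preds = defaultdict(list)
    for job, succs in dag.items():
        for s in succs:
            preds[s].append(job)

    machine_free = {m: 0.0 for m in speed}
    finish, placement = {}, {}
    # With positive costs, descending rank visits predecessors before successors.
    for job in sorted(cost, key=lambda j: -rank[j]):
        ready = max((finish[p] for p in preds[job]), default=0.0)
        # Minimal completion time: pick the machine that finishes this job earliest.
        best = min(
            speed,
            key=lambda m: max(machine_free[m], ready) + cost[job] / speed[m],
        )
        start = max(machine_free[best], ready)
        finish[job] = start + cost[job] / speed[best]
        machine_free[best] = finish[job]
        placement[job] = best
    return placement, finish


# Hypothetical DM workflow: one preprocessing job feeding two classifiers.
dag = {"preprocess": ["tree", "svm"], "tree": ["evaluate"],
       "svm": ["evaluate"], "evaluate": []}
cost = {"preprocess": 4, "tree": 6, "svm": 10, "evaluate": 2}
speed = {"fast": 2.0, "slow": 1.0}
placement, finish = schedule(dag, cost, speed)
print(placement, max(finish.values()))
```

In this toy workflow the longer "svm" branch lies on the critical path, so it is placed first and takes the faster machine, while "tree" runs in parallel on the slower one. A dynamic variant, as the abstract describes, would additionally revise such a mapping at run time through job migrations.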
Pages: 177-189
Page count: 13