Optimizing Distributed Join for Array Database System

Times cited: 0
Authors
Li, Jing [1]
Li, Hui [1]
Chen, Mei [1]
Zhu, Ming [2]
Affiliations
[1] Guizhou Univ, Guizhou Engn Lab ACMIS, Guiyang, Guizhou, Peoples R China
[2] Chinese Acad Sci, Natl Astron Observ, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
distributed array database; join algorithm; network overhead; CPU cost
DOI
10.1109/ITME.2016.127
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
With the sustained and rapid development of science and technology, the explosive growth of scientific data to be analyzed has put great pressure on data management. To relieve this pressure, scientists use array databases instead of RDBMSs to store and manage scientific data. However, our experiments show that although the array database outperforms the RDBMS on simple queries, it does not support complex multi-table join queries well. Because network communication is the slowest component of multi-table join queries in distributed parallel databases, we introduce an optimized join algorithm that not only minimizes network communication by optimizing the transfer schedule but also reduces CPU utilization, preventing the CPU from becoming the bottleneck in computation-intensive workloads. Our evaluation on real scientific data and a real database shows that the optimized algorithm adapts to diverse datasets and query types and enables the array database to outperform the RDBMS on multi-table queries in real workloads.
Pages: 640-644
Page count: 5
Related papers
50 in total
  • [21] A general fragments allocation method for join query in distributed database
    Gao, Jintao
    Liu, Wenjie
    Li, Zhanhuai
    Zhang, Jian
    Shen, Li
    INFORMATION SCIENCES, 2020, 512 : 1249 - 1263
  • [22] GUC-Secure Join Operator in Distributed Relational Database
    Tian, Yuan
    Zhang, Hao
    INFORMATION AND COMMUNICATIONS SECURITY, PROCEEDINGS, 2009, 5927 : 370 - 384
  • [23] A new fragments allocating method for join query in distributed database
    Gao, Jintao
    Li, Zhanhuai
    Liu, Wenjie
    Guo, Zhijun
    Yue, Yantao
    FRONTIERS OF COMPUTER SCIENCE, 2020, 14 (04)
  • [24] Join query optimization in the distributed database system using an artificial bee colony algorithm and genetic operators
    Panahi, Vahideh
    Navimipour, Nima Jafari
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2019, 31 (17)
  • [25] Distributed Top-K Join Queries Optimizing for RDF Datasets
    Gu, Jinguang
    Dong, Hao
    Liu, Zhao
    Xu, Fangfang
    INTERNATIONAL JOURNAL OF WEB SERVICES RESEARCH, 2017, 14 (03) : 67 - 83
  • [26] Optimizing cyclic join view maintenance over distributed data sources
    Liu, B
    Rundensteiner, EA
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2006, 18 (03) : 363 - 376
  • [27] Plexus: Optimizing Join Approximation for Geo-Distributed Data Analytics
    Wolfrath, Joel
    Chandra, Abhishek
    PROCEEDINGS OF THE 2023 ACM SYMPOSIUM ON CLOUD COMPUTING, SOCC 2023, 2023 : 1 - 16
  • [28] A Study of Optimized Algorithm for Distributed Database Half-join Query
    Yu, Xiuxia
    2011 INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND MULTIMEDIA COMMUNICATION, 2011 : 356 - 359
  • [29] Thorough Data Pruning for Join Query in Database System
    Gao, Jintao
    Li, Zhanhuai
    Sun, Jian
    IEEE TRANSACTIONS ON SUSTAINABLE COMPUTING, 2024, 9 (03) : 409 - 421
  • [30] Presenting a New Method for Optimizing Join Queries Processing in Heterogeneous Distributed Databases
    Zafarani, Elnaz
    Derakhshi, Mohammad Reza Feizi
    Asil, Hasan
    Asil, Amir
    THIRD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING: WKDD 2010, PROCEEDINGS, 2010 : 379 - 382