Optimizing Distributed Join for Array Database System

Cited by: 0
Authors
Li, Jing [1]
Li, Hui [1]
Chen, Mei [1]
Zhu, Ming [2]
Affiliations
[1] Guizhou Univ, Guizhou Engn Lab ACMIS, Guiyang, Guizhou, Peoples R China
[2] Chinese Acad Sci, Natl Astron Observ, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
distributed array database; join algorithm; network overhead; CPU cost
DOI
10.1109/ITME.2016.127
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
With the sustained and rapid development of science and technology, the explosive growth of scientific data awaiting analysis has created enormous pressure. To relieve this pressure, scientists use array databases instead of RDBMSs to store and manage scientific data. Our experiments show, however, that although the array database outperforms the RDBMS on simple queries, it does not support complex multi-table join queries well. Because network communication is the slowest component of multi-table join queries in distributed parallel databases, we introduce an optimized join algorithm that not only minimizes network communication by optimizing the transfer schedule, but also reduces CPU utilization, preventing the CPU from becoming the bottleneck in computation-intensive workloads. Our evaluation, based on real scientific data and a real database, shows that the optimized algorithm adapts to diverse datasets and query types, and that it makes the array database outperform the RDBMS on multi-table queries of real workloads.
Pages: 640-644
Page count: 5