Optimization of sub-query processing in distributed data integration systems

Cited by: 10
Authors
Chen, Gang [1]
Wu, Yongwei [1]
Liu, Jia [1]
Yang, Guangwen [1]
Zheng, Weimin [1]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Tsinghua Natl Lab Informat Sci & Technol, Beijing 100084, Peoples R China
Keywords
Cloud computing; Grid computing; Data integration; Query; Data flow;
DOI
10.1016/j.jnca.2010.06.007
Chinese Library Classification (CLC) number
TP3 [Computing Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Data integration systems (DIS) are becoming paramount when Cloud/Grid applications need to integrate and analyze data from geographically distributed data sources. A DIS gathers data from multiple remote sources, then integrates and analyzes the data to obtain a query result. Because Clouds/Grids are distributed over wide-area networks, communication cost usually dominates overall query response time, so query performance can be expected to improve when communication cost is minimized. In our method, the DIS uses a data-flow-style query execution model. Each query plan is mapped to a group of μEngines, each of which is a program corresponding to a particular operator. Thus, multiple sub-queries from concurrent queries are able to share μEngines. We reconstruct these sub-queries to exploit overlapping data among them. As a result, all the sub-queries can obtain their results while overall communication overhead is reduced. Experimental results show that our reconstruction algorithm reduces the average query completion time by 32-48% when the DIS runs a group of parameterized queries, and by 25-35% when it runs a group of non-parameterized queries. (C) 2010 Elsevier Ltd. All rights reserved.
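The idea of reconstructing sub-queries to exploit overlapping data can be illustrated with a small sketch. This is not the paper's algorithm: it assumes, purely for illustration, that every sub-query is an inclusive key-range selection against one remote source, and all names (`SubQuery`, `merge_overlapping`, `run`, `naive_transfer`) are hypothetical. Overlapping ranges are merged so that each merged range is fetched once, and every sub-query is then answered locally from the fetched rows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubQuery:
    """Hypothetical sub-query: select keys with lo <= key <= hi from one source."""
    name: str
    lo: int
    hi: int


def merge_overlapping(subqueries):
    """Group sub-queries whose key ranges overlap into combined fetch ranges."""
    merged = []  # list of (lo, hi, member sub-queries)
    for q in sorted(subqueries, key=lambda q: q.lo):
        if merged and q.lo <= merged[-1][1]:
            lo, hi, members = merged[-1]
            merged[-1] = (lo, max(hi, q.hi), members + [q])
        else:
            merged.append((q.lo, q.hi, [q]))
    return merged


def run(subqueries, source_keys):
    """Fetch each merged range once, then answer every sub-query locally."""
    results, transferred = {}, 0
    for lo, hi, members in merge_overlapping(subqueries):
        # Stands in for a single remote transfer of the merged range.
        fetched = [k for k in source_keys if lo <= k <= hi]
        transferred += len(fetched)
        for q in members:
            results[q.name] = [k for k in fetched if q.lo <= k <= q.hi]
    return results, transferred


def naive_transfer(subqueries, source_keys):
    """Rows transferred when every sub-query is fetched independently."""
    return sum(sum(1 for k in source_keys if q.lo <= k <= q.hi) for q in subqueries)


if __name__ == "__main__":
    keys = list(range(100))
    queries = [SubQuery("A", 10, 50), SubQuery("B", 30, 70), SubQuery("C", 80, 90)]
    results, shared = run(queries, keys)
    print("independent fetches:", naive_transfer(queries, keys))
    print("merged fetches:", shared)
```

In this toy setting the saving comes only from rows shared between ranges A and B; the reductions in completion time reported in the abstract come from the authors' own experiments and are not reproduced by this sketch.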
Pages: 1035-1042
Page count: 8