Cluster-Based Joins for Federated SPARQL Queries

Cited by: 0
Authors
Yang, Fan [1]
Crainiceanu, Adina [2]
Chen, Zhiyuan [1]
Needham, Don [2]
Affiliations
[1] Univ Maryland, Baltimore, MD 21250 USA
[2] United States Naval Acad, Annapolis, MD 21402 USA
Keywords
Clustering algorithms; Resource description framework; Costs; Distributed databases; Seaports; Pattern matching; Marine vehicles; RDF; SPARQL; federated queries; join; cluster; SYSTEM
DOI
10.1109/TKDE.2021.3135507
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Federated RDF systems allow users to retrieve data from multiple independent sources without requiring all the data to reside in the same triple store. The performance of these systems can be poor for large, geographically distributed RDF data, where network transfer costs are high. This article introduces CBTP-OL and CBTP-Nhop, two novel join algorithms that take advantage of network topology to decrease the cost of processing Basic Graph Pattern (BGP) SPARQL queries in a geographically distributed environment. Federation members are grouped into clusters based on the network communication cost between them, and the bulk of the join processing is pushed to the clusters. CBTP-OL uses an overlap list and CBTP-Nhop uses an N-hop overlap list to efficiently compute join results from triples in different clusters. We implement our algorithms in the OpenRDF Sesame federated framework and use Apache Rya triple store instances as federation members. Experimental evaluation results show the advantages of our approach over existing techniques.
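The abstract does not say how federation members are grouped, only that grouping is based on network communication cost. As a rough illustration of that idea, the sketch below groups endpoints whose pairwise cost falls under a threshold, using single-linkage grouping with union-find. The function name `cluster_members`, the endpoint names, the cost values, and the threshold are all hypothetical, and this is not the clustering method of the paper.

```python
from itertools import combinations


def cluster_members(costs, members, threshold):
    """Group members linked by a pairwise cost at or below `threshold`.

    Illustrative single-linkage grouping via union-find; an assumption made
    for this sketch, not the clustering procedure used in the paper.
    """
    parent = {m: m for m in members}

    def find(m):
        # Follow parent links to the representative, shortening the path.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in combinations(members, 2):
        # Costs are treated as symmetric; a missing pair counts as unreachable.
        cost = costs.get((a, b), costs.get((b, a), float("inf")))
        if cost <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for m in members:
        clusters.setdefault(find(m), []).append(m)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical round-trip costs (milliseconds) between four endpoints.
members = ["east-1", "east-2", "west-1", "west-2"]
costs = {
    ("east-1", "east-2"): 5,
    ("west-1", "west-2"): 7,
    ("east-1", "west-1"): 80,
    ("east-1", "west-2"): 85,
    ("east-2", "west-1"): 82,
    ("east-2", "west-2"): 88,
}
clusters = cluster_members(costs, members, threshold=20)
print(clusters)  # [['east-1', 'east-2'], ['west-1', 'west-2']]
```

With clusters like these, joins between members of the same cluster only pay the low intra-cluster cost, which is the motivation the abstract gives for pushing most join processing into the clusters.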
Pages: 3525-3539
Number of pages: 15