qEndpoint: A novel triple store architecture for large RDF graphs

Authors
Willerval, Antoine [1, 2]
Diefenbach, Dennis [1 ]
Bonifati, Angela [2 ]
Affiliations
[1] QA Co, St Etienne, France
[2] Lyon 1 Univ, CNRS, Liris, IUF, Villeurbanne, France
Keywords
RDF; qEndpoint; HDT; RDF4J; Wikidata; ENGINE;
DOI
10.3233/SW-243616
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In the relational database realm, there has been a shift towards novel hybrid database architectures combining the properties of transaction processing (OLTP) and analytical processing (OLAP). OLTP workloads are made up of read and write operations on a small number of rows and are typically addressed by indexes such as B+trees. OLAP workloads, on the other hand, consist of large read operations that scan larger parts of the dataset. To address both workloads, some databases introduced an architecture using a buffer or delta partition. Precisely, changes are accumulated in a write-optimized delta partition while the rest of the data is compressed in the read-optimized main partition. Periodically, the delta storage is merged into the main partition. In this paper we investigate for the first time how this architecture can be implemented and how it behaves for RDF graphs. We describe in detail the indexing structures one can use for each partition, the merge process, as well as the transactional management. We study the performance of our triple store, which we call qEndpoint, over two popular benchmarks, the Berlin SPARQL Benchmark (BSBM) and the recent Wikidata Benchmark (WDBench). We also study how it compares against other public Wikidata endpoints. This allows us to study the behavior of the triple store for different workloads, as well as the scalability over large RDF graphs. The results show that, compared to the baselines, our triple store allows for improved indexing times, better response time for some queries, higher insert and delete rates, and low disk and memory footprints, making it ideal to store and serve large Knowledge Graphs.
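The delta/main split described in the abstract can be illustrated with a toy in-memory sketch. This is not qEndpoint's implementation: the class `DeltaMainStore`, its method names, and the choice of a sorted tuple for the main partition and Python sets for the delta are all assumptions made here for illustration. It only shows the general idea that writes go to a small mutable delta, reads combine both partitions, and a periodic merge folds the delta into a rebuilt main partition.

```python
from bisect import bisect_left
from typing import Iterator, Optional

Triple = tuple[str, str, str]
Pattern = tuple[Optional[str], Optional[str], Optional[str]]


class DeltaMainStore:
    """Hypothetical toy store: immutable sorted main partition plus a mutable delta."""

    def __init__(self, triples=()):
        # Read-optimized main partition: sorted and never mutated in place.
        self.main: tuple[Triple, ...] = tuple(sorted(set(triples)))
        # Write-optimized delta: pending insertions and deletions.
        self.added: set[Triple] = set()
        self.deleted: set[Triple] = set()

    def _in_main(self, t: Triple) -> bool:
        # Binary search over the sorted main partition.
        i = bisect_left(self.main, t)
        return i < len(self.main) and self.main[i] == t

    def add(self, t: Triple) -> None:
        self.deleted.discard(t)
        if not self._in_main(t):
            self.added.add(t)

    def remove(self, t: Triple) -> None:
        self.added.discard(t)
        if self._in_main(t):
            self.deleted.add(t)

    def match(self, pattern: Pattern) -> Iterator[Triple]:
        """Yield triples matching the pattern; None acts as a wildcard."""
        def ok(t: Triple) -> bool:
            return all(p is None or p == v for p, v in zip(pattern, t))

        # Main partition, minus triples marked as deleted in the delta.
        for t in self.main:
            if t not in self.deleted and ok(t):
                yield t
        # Triples that so far exist only in the delta.
        for t in sorted(self.added):
            if ok(t):
                yield t

    def merge(self) -> None:
        """Fold the delta into a freshly built main partition, then reset it."""
        self.main = tuple(sorted((set(self.main) - self.deleted) | self.added))
        self.added.clear()
        self.deleted.clear()


store = DeltaMainStore([("s1", "p", "o1"), ("s2", "p", "o2")])
store.add(("s3", "p", "o3"))
store.remove(("s1", "p", "o1"))
visible_before_merge = list(store.match((None, "p", None)))
store.merge()
```

In this sketch, queries see the same result before and after `merge()`; only the physical layout changes. A real system would use compressed on-disk structures for the main partition and would have to keep serving reads and writes while the merge runs, which the toy version ignores.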
Pages: 2069-2087
Page count: 19