Scalable Supergraph Search in Large Graph Databases

Cited by: 0
Authors
Lyu, Bingqing [1]
Qin, Lu [2]
Lin, Xuemin [1, 3]
Chang, Lijun [3]
Yu, Jeffrey Xu [4]
Affiliations
[1] East China Normal Univ, Shanghai, Peoples R China
[2] Univ Technol Sydney, Ctr Quantum Computat & Intelligent Syst, Sydney, NSW, Australia
[3] Univ New South Wales, Sydney, NSW, Australia
[4] Chinese Univ Hong Kong, Hong Kong, Hong Kong, Peoples R China
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Supergraph search is a fundamental problem in graph databases with many applications. Given a graph database and a query-graph, supergraph search retrieves all data-graphs in the database that are contained in the query-graph. Most existing solutions follow the pruning-and-verification framework: the pruning phase eliminates false answers based on features, and the verification phase performs subgraph isomorphism tests on the remaining graphs. However, these solutions do not scale to large data-graphs and query-graphs because of three drawbacks. First, they rely on a frequent subgraph mining algorithm to select features, which is expensive and cannot generate large features. Second, they require a costly verification phase. Third, they process features in a fixed order without considering their relationship to the query-graph. In this paper, we address these three drawbacks and propose new indexing and query processing algorithms. In indexing, we select features directly from the data-graphs without expensive frequent subgraph mining. The features form a feature-tree that contains features of all sizes, and the selection considers both the cost sharing and the pruning power of the features. In query processing, we propose a verification-free algorithm in which the order of processing features is query-dependent, taking into account both cost sharing and pruning power. We explore two optimization strategies to further improve efficiency: the first applies a lightweight graph compression technique, and the second optimizes the inclusion of answers. Finally, we conduct extensive performance studies on two large real datasets to demonstrate the high scalability of our algorithms.
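The pruning-and-verification framework described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's feature-tree index or its verification-free algorithm. The labeled-edge features, the brute-force isomorphism test, the graph representation, and all function names here are assumptions chosen for illustration, and the code is only practical for very small graphs.

```python
from itertools import permutations

# Assumed representation: a graph is (labels, edges), where labels maps
# vertex -> label and edges is a set of frozenset({u, v}) undirected edges.
# Self-loops are not supported in this sketch.

def make_graph(labels, edge_list):
    return labels, {frozenset(e) for e in edge_list}

def edge_features(graph):
    """Sorted label pairs of all edges: a simple stand-in for mined features."""
    labels, edges = graph
    feats = set()
    for e in edges:
        u, v = tuple(e)
        feats.add(tuple(sorted((labels[u], labels[v]))))
    return feats

def is_subgraph(small, big):
    """Brute-force (non-induced) subgraph isomorphism test; exponential time."""
    s_labels, s_edges = small
    b_labels, b_edges = big
    s_nodes = list(s_labels)
    if len(s_nodes) > len(b_labels):
        return False
    # Try every injective mapping of small's vertices into big's vertices.
    for image in permutations(b_labels, len(s_nodes)):
        mapping = dict(zip(s_nodes, image))
        if any(s_labels[v] != b_labels[mapping[v]] for v in s_nodes):
            continue
        if all(frozenset(mapping[v] for v in e) in b_edges for e in s_edges):
            return True
    return False

def supergraph_search(database, query):
    """Return names of data-graphs that are contained in the query-graph."""
    query_feats = edge_features(query)
    answers = []
    for name, g in database.items():
        # Pruning: a feature of g that the query lacks rules g out.
        if not edge_features(g) <= query_feats:
            continue
        # Verification: confirm containment on the surviving candidates.
        if is_subgraph(g, query):
            answers.append(name)
    return answers

# Query-graph: path A - B - C.
query = make_graph({1: "A", 2: "B", 3: "C"}, [(1, 2), (2, 3)])
database = {
    "g1": make_graph({"x": "A", "y": "B"}, [("x", "y")]),            # contained
    "g2": make_graph({"x": "A", "y": "C"}, [("x", "y")]),            # pruned
    "g3": make_graph({"x": "A", "y": "B", "z": "A"},
                     [("x", "y"), ("y", "z")]),                       # fails verification
}
print(supergraph_search(database, query))
```

In this toy database, `g2` is discarded by pruning because its A-C edge does not occur in the query, while `g3` passes pruning but fails verification because the query has only one vertex labeled A. Cases like `g3` are why feature-based pruning alone does not suffice in this framework.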
Pages: 157 - 168
Page count: 12