Mining and Indexing Graphs for Supergraph Search

Cited by: 9
Authors
Yuan, Dayu [1]
Mitra, Prasenjit [1, 2]
Giles, C. Lee [1, 2]
Affiliations
[1] Penn State Univ, Dept Comp Sci & Engn, University Pk, PA 16802 USA
[2] Penn State Univ, Coll Informat Sci & Technol, University Pk, PA 16802 USA
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2013 / Vol. 6 / No. 10
Funding
National Science Foundation (USA)
DOI
10.14778/2536206.2536211
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We study supergraph search (SPS), that is, given a query graph q and a graph database G that contains a collection of graphs, return the graphs in G that have q as a supergraph. SPS has broad applications in bioinformatics, cheminformatics and other scientific and commercial fields. Determining whether a graph is a subgraph (or supergraph) of another is an NP-complete problem. Hence, it is intractable to compute SPS for large graph databases. Two separate indexing methods, a "filter + verify"-based method and a "prefix-sharing"-based method, have been studied to efficiently compute SPS. To implement the above two methods, subgraph patterns are mined from the graph database to build an index. Those subgraphs are mined to optimize either the filtering gain or the prefix-sharing gain. However, no single subgraph-mining algorithm considers both gains. This work is the first to mine subgraphs to optimize both the filtering gain and the prefix-sharing gain while processing SPS queries. First, we show that the subgraph-mining problem is NP-hard. Then, we propose two polynomial-time algorithms to solve the problem with approximation ratios of 1 - 1/e and 1/4, respectively. In addition, we construct a lattice-like index, LW-index, to organize the selected subgraph patterns for fast index-lookup. Our experiments show that our approach improves the query processing time for SPS queries by a factor of 3 to 10.
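To make the "filter + verify" idea in the abstract concrete, here is a minimal sketch in Python. It is not the paper's LW-index or its mining algorithm: the `Graph` class, `is_subgraph`, `build_index`, and `supergraph_search` are hypothetical names introduced only for illustration, the feature patterns are chosen by hand rather than mined, and the subgraph test is a naive backtracking search with exponential worst-case cost. The filtering rule it relies on is that if an indexed pattern f is not a subgraph of the query q, then no database graph containing f can be a subgraph of q.

```python
class Graph:
    """Small undirected graph with one label per node (illustrative only)."""

    def __init__(self, labels, edges):
        self.labels = dict(labels)                 # node -> label
        self.adj = {v: set() for v in self.labels}
        for u, v in edges:
            self.adj[u].add(v)
            self.adj[v].add(u)


def is_subgraph(small, big):
    """True if `small` maps injectively into `big`, preserving labels and edges.

    Naive backtracking over node assignments (non-induced subgraph test).
    """
    nodes = list(small.labels)

    def extend(mapping):
        if len(mapping) == len(nodes):
            return True
        u = nodes[len(mapping)]
        for cand in big.labels:
            if cand in mapping.values() or big.labels[cand] != small.labels[u]:
                continue
            # Every already-mapped neighbour of u must be adjacent to cand.
            if all(mapping[n] in big.adj[cand] for n in small.adj[u] if n in mapping):
                mapping[u] = cand
                if extend(mapping):
                    return True
                del mapping[u]
        return False

    return extend({})


def build_index(database, features):
    """For each feature pattern, record the ids of database graphs containing it."""
    return [{i for i, g in enumerate(database) if is_subgraph(f, g)} for f in features]


def supergraph_search(query, database, features, index):
    """Return ids of database graphs that are subgraphs of `query`."""
    candidates = set(range(len(database)))
    # Filter: a feature missing from the query rules out every graph containing it.
    for f, containing in zip(features, index):
        if not is_subgraph(f, query):
            candidates -= containing
    # Verify: run the expensive test only on the surviving candidates.
    return sorted(i for i in candidates if is_subgraph(database[i], query))


# Toy data (assumed for illustration): the query is the labeled path C-C-O.
query = Graph({0: "C", 1: "C", 2: "O"}, [(0, 1), (1, 2)])
database = [
    Graph({0: "C", 1: "C"}, [(0, 1)]),                  # C-C
    Graph({0: "C", 1: "O"}, [(0, 1)]),                  # C-O
    Graph({0: "C", 1: "N"}, [(0, 1)]),                  # C-N
    Graph({0: "C", 1: "C", 2: "C"}, [(0, 1), (1, 2)]),  # C-C-C
]
features = [
    Graph({0: "C", 1: "N"}, [(0, 1)]),  # hand-picked pattern C-N
    Graph({0: "C", 1: "C"}, [(0, 1)]),  # hand-picked pattern C-C
]
index = build_index(database, features)
result = supergraph_search(query, database, features, index)
```

In this toy run the C-N pattern is absent from the query, so the graph that contains it is discarded without a verification call. Which patterns to index is exactly the selection problem the abstract addresses; the sketch sidesteps it by fixing the patterns by hand.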
Pages: 829-840
Number of pages: 12