Mining and Indexing Graphs for Supergraph Search

Cited by: 9
Authors
Yuan, Dayu [1 ]
Mitra, Prasenjit [1 ,2 ]
Giles, C. Lee [1 ,2 ]
Affiliations
[1] Penn State Univ, Dept Comp Sci & Engn, University Pk, PA 16802 USA
[2] Penn State Univ, Coll Informat Sci & Technol, University Pk, PA 16802 USA
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2013 / Vol. 6 / No. 10
Funding
National Science Foundation (USA)
DOI
10.14778/2536206.2536211
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We study supergraph search (SPS): given a query graph q and a graph database G that contains a collection of graphs, return the graphs in G that have q as a supergraph. SPS has broad applications in bioinformatics, cheminformatics, and other scientific and commercial fields. Determining whether a graph is a subgraph (or supergraph) of another is an NP-complete problem. Hence, it is intractable to compute SPS for large graph databases. Two separate indexing methods, a "filter + verify"-based method and a "prefix-sharing"-based method, have been studied to compute SPS efficiently. To implement these two methods, subgraph patterns are mined from the graph database to build an index. Those subgraphs are mined to optimize either the filtering gain or the prefix-sharing gain. However, no single subgraph-mining algorithm considers both gains. This work is the first to mine subgraphs that optimize both the filtering gain and the prefix-sharing gain while processing SPS queries. First, we show that the subgraph-mining problem is NP-hard. Then, we propose two polynomial-time algorithms to solve the problem, with approximation ratios of 1 - 1/e and 1/4, respectively. In addition, we construct a lattice-like index, LW-index, to organize the selected subgraph patterns for fast index lookup. Our experiments show that our approach improves the query processing time for SPS queries by a factor of 3 to 10.
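The "filter + verify" idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm or its LW-index. It assumes tiny node-labeled undirected graphs, a brute-force subgraph test, and hand-picked feature patterns, and every name in it (`graph`, `is_subgraph`, `build_index`, `supergraph_search`) is hypothetical. The filter relies on one observation: if an indexed feature f is not contained in the query q, then no database graph containing f can be a subgraph of q, so those graphs are pruned before the expensive verification step.

```python
from itertools import permutations


def graph(labels, edges):
    """Build a graph as (labels, edges); edges become frozensets (undirected)."""
    return labels, {frozenset(e) for e in edges}


def is_subgraph(small, big):
    """Brute-force test: is `small` (non-induced) subgraph-isomorphic to `big`?

    Tries every injective node mapping, so it is only usable on tiny graphs.
    """
    s_labels, s_edges = small
    b_labels, b_edges = big
    s_nodes = list(s_labels)
    for image in permutations(b_labels, len(s_nodes)):
        mapping = dict(zip(s_nodes, image))
        if any(s_labels[u] != b_labels[mapping[u]] for u in s_nodes):
            continue  # node labels must match
        if all(frozenset(mapping[u] for u in e) in b_edges for e in s_edges):
            return True  # every edge of `small` is preserved in `big`
    return False


def build_index(database, features):
    """For each feature, record the ids of database graphs that contain it."""
    return {name: {gid for gid, g in database.items() if is_subgraph(f, g)}
            for name, f in features.items()}


def supergraph_search(q, database, features, index):
    """Return ids of database graphs that are subgraphs of the query q."""
    candidates = set(database)
    for name, f in features.items():
        if not is_subgraph(f, q):
            # Filter: graphs containing f cannot be contained in q.
            candidates -= index[name]
    # Verify: run the expensive test only on the surviving candidates.
    return sorted(gid for gid in candidates if is_subgraph(database[gid], q))


# Toy database of labeled graphs (assumed data, for illustration only).
database = {
    "g1": graph({0: "C", 1: "C"}, [(0, 1)]),
    "g2": graph({0: "C", 1: "O"}, [(0, 1)]),
    "g3": graph({0: "C", 1: "N"}, [(0, 1)]),
    "g4": graph({0: "C", 1: "C", 2: "O"}, [(0, 1), (1, 2)]),
}
features = {
    "C-O": graph({0: "C", 1: "O"}, [(0, 1)]),
    "C-N": graph({0: "C", 1: "N"}, [(0, 1)]),
}
index = build_index(database, features)

q = graph({0: "C", 1: "C", 2: "O"}, [(0, 1), (1, 2)])
result = supergraph_search(q, database, features, index)
print(result)
```

In this toy run the feature "C-N" is absent from q, so g3 is discarded without a verification call. Which features to index is the choice that determines how much verification is avoided, and that is the selection problem the abstract describes.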
Pages: 829 - 840
Page count: 12