Efficient Correlation Search from Graph Databases

Cited by: 12
Authors
Ke, Yiping [1 ]
Cheng, James [1 ]
Ng, Wilfred [1 ]
Affiliations
[1] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Kowloon, Hong Kong, Peoples R China
Keywords
Correlation; graph databases; Pearson's correlation coefficient
DOI
10.1109/TKDE.2008.86
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose a new problem of correlation mining from graph databases, called Correlated Graph Search (CGS). CGS adopts Pearson's correlation coefficient as the correlation measure to take into account the occurrence distributions of graphs. However, the CGS problem poses significant challenges, since every subgraph of a graph in the database is a candidate, but the number of subgraphs is exponential. We derive two necessary conditions that set bounds on the occurrence probability of a candidate in the database. With this result, we devise an efficient algorithm that mines the candidate set from a much smaller projected database, and thus, we are able to obtain a significantly smaller set of candidates. Three heuristic rules are further developed to refine the candidate set. We also make use of the bounds to directly answer high-support queries without mining the candidates. Our experimental results demonstrate the efficiency of our algorithm. Finally, we show that our algorithm provides a general solution when most of the commonly used correlation measures are used to generalize the CGS problem.
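As an illustration of the correlation measure named in the abstract, the sketch below computes Pearson's correlation coefficient between two graphs treated as binary occurrence variables over the database (this form is also known as the phi coefficient). It is a minimal sketch under stated assumptions, not the authors' algorithm: the function name `pearson_correlation`, the representation of occurrences as sets of database graph IDs, and the zero-denominator convention are all hypothetical choices made for this example, and no subgraph matching or candidate mining is performed.

```python
from math import sqrt


def pearson_correlation(occ_a, occ_b, db_size):
    """Pearson's (phi) coefficient of two binary occurrence variables.

    occ_a, occ_b: sets of IDs of the database graphs that contain each
                  pattern (assumed to be precomputed elsewhere).
    db_size:      total number of graphs in the database.
    """
    supp_a = len(occ_a) / db_size           # occurrence probability of a
    supp_b = len(occ_b) / db_size           # occurrence probability of b
    supp_ab = len(occ_a & occ_b) / db_size  # joint occurrence probability

    denom = sqrt(supp_a * supp_b * (1 - supp_a) * (1 - supp_b))
    if denom == 0:
        # Assumed convention: a pattern occurring in none or all of the
        # graphs has zero variance, so the coefficient is reported as 0.
        return 0.0
    return (supp_ab - supp_a * supp_b) / denom


if __name__ == "__main__":
    query_occ = {0, 1, 2, 3}      # hypothetical query graph occurrences
    candidate_occ = {0, 1, 2, 4}  # hypothetical candidate occurrences
    print(pearson_correlation(query_occ, candidate_occ, db_size=10))
```

Because the joint occurrence probability can never exceed the smaller of the two individual probabilities, the coefficient is limited by how far apart those probabilities are, which is the intuition behind bounding a candidate's occurrence probability.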
Pages: 1601-1615
Number of pages: 15