Summarizing Large-Scale Database Schema Using Community Detection

Authors
王雪
周烜
王珊
Affiliations
[1] School of Information, Renmin University of China
[2] Key Laboratory of Data Engineering and Knowledge Engineering, Renmin University of China
Funding
National Natural Science Foundation of China
Keywords
schema; summarization; large scale; community detection
DOI
Not available
Chinese Library Classification number
TP311.13
Discipline classification code
1201
Abstract
Schema summarization on large-scale databases is challenging. In a typical large database schema, a large proportion of the tables are closely connected through a few high-degree tables. It is thus difficult to separate these tables into clusters that represent different topics. Moreover, as a schema can be very large, the schema summary needs to be structured into multiple levels to further improve usability. In this paper, we introduce a new schema summarization approach that uses community detection techniques from social networks. Our approach consists of three steps. First, we use a community detection algorithm to divide a database schema into subject groups, each representing a specific subject. Second, we cluster the subject groups into abstract domains to form a multi-level navigation structure. Third, we discover representative tables in each cluster to label the schema summary. We evaluate our approach on Freebase, a real-world large-scale database. The results show that our approach can identify subject groups precisely, and that the generated abstract schema layers are very helpful for users exploring the database.
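The first and third steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not name the community detection method, so the sketch substitutes a naive greedy modularity merge, and it assumes that the highest-degree table in a group is a reasonable label for it. The table names, the `FOREIGN_KEYS` list, and both function names are hypothetical. The second step (clustering subject groups into abstract domains) is omitted because the abstract gives no detail on it.

```python
from itertools import combinations

# Hypothetical toy schema: each pair is a foreign-key link between two tables.
FOREIGN_KEYS = [
    ("person", "film"), ("person", "actor"), ("film", "actor"),
    ("artist", "album"), ("artist", "track"), ("album", "track"),
    ("person", "artist"),  # single link bridging the two subject areas
]


def detect_subject_groups(edges):
    """Greedy modularity merging (an assumed stand-in for the paper's method).

    Start with one group per table, then repeatedly merge the pair of groups
    with the largest positive modularity gain until no merge helps.
    """
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    groups = [frozenset([table]) for table in sorted(degree)]
    while len(groups) > 1:
        best_gain, best_pair = 0, None
        for a, b in combinations(groups, 2):
            links = sum(
                1 for u, v in edges
                if (u in a and v in b) or (u in b and v in a)
            )
            deg_a = sum(degree[t] for t in a)
            deg_b = sum(degree[t] for t in b)
            # Modularity gain of merging a and b, scaled by 2*m*m so that
            # the comparison uses exact integers instead of floats.
            gain = 2 * m * links - deg_a * deg_b
            if gain > best_gain:
                best_gain, best_pair = gain, (a, b)
        if best_pair is None:
            break  # no merge improves modularity any further
        a, b = best_pair
        groups = [g for g in groups if g not in (a, b)] + [a | b]

    return sorted(sorted(g) for g in groups), degree


def representative(group, degree):
    """Pick the highest-degree table in a group; ties break alphabetically."""
    return min(group, key=lambda table: (-degree[table], table))


groups, degree = detect_subject_groups(FOREIGN_KEYS)
summary = {representative(g, degree): g for g in groups}
print(summary)
```

On this toy schema the two triangles of tables end up in separate groups, labeled by `person` and `artist`. A real schema with a few very high-degree hub tables, which the abstract identifies as the hard case, would likely need a more careful method than this plain merge.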
Pages: 515 - 526
Page count: 12