Summarizing Large-Scale Database Schema Using Community Detection

Cited by: 0
Authors
Xue Wang
Xuan Zhou
Shan Wang
Affiliations
[1] School of Information, Renmin University of China
[2] Key Laboratory of Data Engineering and Knowledge Engineering, Renmin University of China
Funding
National Natural Science Foundation of China
Keywords
schema; summarization; large scale; community detection;
DOI
Not available
Chinese Library Classification number
TP311.13
Discipline classification code
1201
Abstract
Schema summarization on large-scale databases is a challenge. In a typical large database schema, a great proportion of the tables are closely connected through a few high-degree tables. It is thus difficult to separate these tables into clusters that represent different topics. Moreover, as a schema can be very big, the schema summary needs to be structured into multiple levels to further improve usability. In this paper, we introduce a new schema summarization approach utilizing the techniques of community detection in social networks. Our approach contains three steps. First, we use a community detection algorithm to divide a database schema into subject groups, each representing a specific subject. Second, we cluster the subject groups into abstract domains to form a multi-level navigation structure. Third, we discover representative tables in each cluster to label the schema summary. We evaluate our approach on Freebase, a real-world large-scale database. The results show that our approach can identify subject groups precisely. The generated abstract schema layers are very helpful for users exploring the database.
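The three steps in the abstract can be illustrated with a minimal sketch. The abstract does not say which community detection algorithm, clustering criterion, or representativeness measure the paper uses, so everything below is an assumption: a deterministic label propagation stands in for step 1, a link-count threshold (`min_links`) stands in for step 2, and the highest-degree table stands in for step 3. The table names and foreign keys are hypothetical toy data, not Freebase.

```python
from collections import Counter, defaultdict

# Hypothetical toy schema: each pair is a foreign-key link between two tables.
FOREIGN_KEYS = [
    ("customer", "order"), ("customer", "order_item"), ("customer", "product"),
    ("order", "order_item"), ("order", "product"), ("order_item", "product"),
    ("employee", "department"), ("employee", "payroll"), ("department", "payroll"),
    ("order", "employee"),  # single bridge between the two subjects
]


def build_graph(edges):
    """Return undirected adjacency sets keyed by table name."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return dict(adj)


def detect_groups(adj, max_rounds=20):
    """Step 1 stand-in (assumption): deterministic label propagation."""
    label = {t: t for t in adj}
    for _ in range(max_rounds):
        changed = False
        for t in sorted(adj):  # fixed visiting order keeps the run repeatable
            counts = Counter(label[n] for n in adj[t])
            best = max(counts.values())
            top = sorted(lab for lab, c in counts.items() if c == best)
            # Keep the current label on ties, else take the smallest top label.
            new = label[t] if label[t] in top else top[0]
            if new != label[t]:
                label[t], changed = new, True
        if not changed:
            break
    groups = defaultdict(set)
    for t, lab in label.items():
        groups[lab].add(t)
    return sorted(groups.values(), key=lambda g: sorted(g))


def merge_into_domains(groups, adj, min_links=1):
    """Step 2 stand-in (assumption): union groups sharing >= min_links edges."""
    parent = list(range(len(groups)))

    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i

    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):
            links = sum(len(adj[t] & groups[j]) for t in groups[i])
            if links >= min_links:
                parent[find(j)] = find(i)
    domains = defaultdict(list)
    for i in range(len(groups)):
        domains[find(i)].append(i)
    return list(domains.values())


def representative(group, adj):
    """Step 3 stand-in (assumption): the table with the most links."""
    return max(sorted(group), key=lambda t: len(adj[t]))


adj = build_graph(FOREIGN_KEYS)
groups = detect_groups(adj)
labels = [representative(g, adj) for g in groups]
domains = merge_into_domains(groups, adj, min_links=1)
for members in domains:
    print([labels[i] for i in members])
```

On a real schema, a few high-degree tables can pull many tables toward one label, which is the difficulty the abstract describes; this sketch does nothing to counter that and only shows how the three stages fit together.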
Pages: 515-526
Number of pages: 12
Related papers
50 in total
  • [1] Summarizing Large-Scale Database Schema Using Community Detection
    Wang, Xue
    Zhou, Xuan
    Wang, Shan
    Journal of Computer Science and Technology, 2012, 27 (03) : 515 - 526
  • [3] Large-Scale Graphs Community Detection using Spark GraphFrames
    Apostol, Elena-Simona
    Cojocaru, Adrian-Cosmin
    Truica, Ciprian-Octavian
    2024 23RD INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED COMPUTING, ISPDC 2024, 2024,
  • [4] Summarizing database schema based on graph partition
    Wang, Yingqi
    Zhou, Lianke
    Wang, Nianbin
    MULTIMEDIA TOOLS AND APPLICATIONS, 2019, 78 (08) : 10077 - 10096
  • [6] Community Detection in Large-scale Bipartite Networks
    Liu, Xin
    Murata, Tsuyoshi
    2009 IEEE/WIC/ACM INTERNATIONAL JOINT CONFERENCES ON WEB INTELLIGENCE (WI) AND INTELLIGENT AGENT TECHNOLOGIES (IAT), VOL 1, 2009, : 50 - 57
  • [7] Summarizing Relational Database Schema Based on Label Propagation
    Yuan, Xiaojie
    Li, Xinkun
    Yu, Man
    Cai, Xiangrui
    Zhang, Ying
    Wen, Yanlong
    WEB TECHNOLOGIES AND APPLICATIONS, APWEB 2014, 2014, 8709 : 258 - 269
  • [8] A database schema for large scale annotated image dataset
    Peng, Shaowu
    Liu, Leyuan
    Yang, Xiong
    Sang, Nong
    CISP 2008: FIRST INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, VOL 3, PROCEEDINGS, 2008, : 57 - 62
  • [9] VCDB: A Large-Scale Database for Partial Copy Detection in Videos
    Jiang, Yu-Gang
    Jiang, Yudong
    Wang, Jiajun
    COMPUTER VISION - ECCV 2014, PT IV, 2014, 8692 : 357 - 371
  • [10] Hybrid schema summarization method of large scale database
    Wang, Xue
    Zhou, Xuan
    Wang, Shan
    Jisuanji Xuebao/Chinese Journal of Computers, 2013, 36 (08): : 1616 - 1625