An Algorithm of Semi-structured Data Scheme Extraction Based on OEM Model

Cited by: 0
Authors
Gong, An [1]
Yang, Xue-wei [1]
Affiliations
[1] China Univ Petr E China, Coll Comp & Commun Engn, Dongying 257061, Peoples R China
Keywords
Semi-structured data; frequent patterns mining; OEM; the longest frequent label path;
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Subject classification code
0812 ;
Abstract
To obtain the target model of semi-structured data quickly, effectively, and accurately, this paper draws on the properties of label paths it establishes and proposes an algorithm that extracts the target model directly from the OEM model of semi-structured data. The basic idea of the algorithm is as follows. A depth-first search first collects all label path expressions. Property 2 of the paper is then used to reduce the number of path matches while all frequent label path expressions are generated layer by layer. Finally, a deletion strategy yields all of the longest frequent label path expressions. Theoretical analysis and experimental results show that the algorithm improves the accuracy of the target model and reduces the size of the candidate sets in pattern extraction.
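The three steps named in the abstract (depth-first enumeration of label paths, layer-by-layer generation of frequent paths, and deletion of non-longest paths) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not state the property it calls nature2, so the sketch assumes the usual prefix-closure rule (a label path can be frequent only if its prefix is frequent). The nested-dict encoding of OEM objects, the definition of support as the number of objects containing a path, and the names `label_paths` and `longest_frequent_paths` are all hypothetical.

```python
from collections import Counter


def label_paths(node, prefix=(), on_stack=None):
    """Depth-first search: collect every root-anchored label path of one object.

    Assumed encoding: a complex OEM object is a dict mapping a label to a
    list of child objects; anything else is an atomic value.
    """
    if on_stack is None:
        on_stack = set()
    paths = set()
    if not isinstance(node, dict) or id(node) in on_stack:
        return paths  # atomic value, or a cycle back to an ancestor
    on_stack.add(id(node))
    for label, children in node.items():
        path = prefix + (label,)
        paths.add(path)
        for child in children:
            paths |= label_paths(child, path, on_stack)
    on_stack.discard(id(node))
    return paths


def longest_frequent_paths(objects, min_support):
    """Generate frequent label paths layer by layer, then keep the longest."""
    per_object = [label_paths(obj) for obj in objects]
    frequent, previous, k = set(), {()}, 1
    while previous:
        # Assumed pruning rule (prefix closure): only paths that extend a
        # frequent path of length k - 1 are counted at layer k.
        counts = Counter(
            p for paths in per_object for p in paths
            if len(p) == k and p[:-1] in previous
        )
        previous = {p for p, c in counts.items() if c >= min_support}
        frequent |= previous
        k += 1
    # Deletion step: drop every frequent path that is a proper prefix
    # of another frequent path.
    prefixes = {p[:i] for p in frequent for i in range(1, len(p))}
    return sorted(frequent - prefixes)


objects = [
    {"book": [{"title": ["A"], "author": [{"name": ["X"]}]}]},
    {"book": [{"title": ["B"], "author": [{"name": ["Y"]}], "price": [10]}]},
    {"book": [{"title": ["C"]}]},
]
print(longest_frequent_paths(objects, min_support=2))
```

With `min_support=2`, the path `book/price` occurs in only one object and is never extended, while `book` and `book/author` are removed in the deletion step because longer frequent paths contain them as prefixes.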
Pages: 315 - 319
Page count: 5
Related papers
50 in total
  • [32] A view-based approach to the integration of structured and semi-structured data
    Ahmad, Honda
    Kermanshahani, Shokooh
    Simonet, Ana
    Simonet, Michel
    DATABASES AND INFORMATION SYSTEMS: COMMUNICATIONS, MATERIALS OF DOCTORAL CONSORTIUM, 2006, : 41 - 51
  • [33] A Schema Feature Based Frequent Pattern Mining Algorithm for Semi-structured Data Stream
    Fu, Weiqi
    Liao, Husheng
    Jin, Xueyun
    PROCEEDINGS OF THE 2017 5TH INTERNATIONAL CONFERENCE ON FRONTIERS OF MANUFACTURING SCIENCE AND MEASURING TECHNOLOGY (FMSMT 2017), 2017, 130 : 1329 - 1336
  • [34] Multi-level schema extraction for heterogeneous semi-structured data
    Yoon, JP
    Raghavan, V
    WEB-AGE INFORMATION MANAGEMENT, PROCEEDINGS, 2000, 1846 : 411 - 422
  • [35] An Algorithm for Constructing a Topological Skeleton for Semi-structured Spatial Data Based on Persistent Homology
    Eremeev, Sergey
    Romanov, Semyon
    ANALYSIS OF IMAGES, SOCIAL NETWORKS AND TEXTS (AIST 2019), 2020, 1086 : 16 - 26
  • [36] Semi-Structured Data Model for Big Data (SS-DMBD)
    Hamouda, Shady
    Zainol, Zurinahni
    PROCEEDINGS OF THE 8TH INTERNATIONAL CONFERENCE ON DATA SCIENCE, TECHNOLOGY AND APPLICATIONS (DATA), 2019, : 348 - 356
  • [37] Automatic Content Extraction on Semi-Structured Documents
    dos Santos, Jose Eduardo Bastos
    11TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR 2011), 2011, : 1235 - 1239
  • [38] Query optimization for semi-structured data
    Li, GY
    Bian, S
    Zhang, J
    Xie, Y
    PROCEEDINGS OF THE 2004 INTERNATIONAL CONFERENCE ON MANAGEMENT SCIENCE & ENGINEERING, VOLS 1 AND 2, 2004, : 97 - 100
  • [39] Survey on Mining in Semi-Structured Data
    Shettar, Rajashree
    Shobha, G.
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2007, 7 (08): : 226 - 231
  • [40] Low-Dimensionality Information Extraction Model for Semi-structured Documents
    Belhadj, Djedjiga
    Belaid, Abdel
    Belaid, Yolande
    COMPUTER ANALYSIS OF IMAGES AND PATTERNS, CAIP 2023, PT I, 2023, 14184 : 76 - 85