A Schema Feature Based Frequent Pattern Mining Algorithm for Semi-structured Data Stream

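Authors
Fu, Weiqi [1]
Liao, Husheng [1]
Jin, Xueyun [1]
Affiliations
[1] Beijing Univ Technol, Fac Informat Technol, Beijing, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation
Keywords
frequent pattern mining; semi-structured data stream; schema feature
DOI
Not available
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08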
Abstract
Data mining finds useful information in massive data, and frequent pattern mining is one of its important tasks. Recently, research on frequent pattern mining for semi-structured data has made some progress, and frequent pattern mining for data streams has also received considerable attention. However, only a few studies address both semi-structured data and data streams. This paper proposes an algorithm named SPrefixTreeISpan. We first segment the semi-structured data stream and then use the pattern-growth method to mine each segment. Finally, we maintain all the results in a structure called patternTree. In addition, the mining algorithm is optimized using the inevitable parent-child and inevitable child-parent relationships extracted from the XML schema. Experiments show that SPrefixTreeISpan has better performance.
Pages: 1329-1336
Page count: 8