A two-phase approach for unexpected pattern mining

Cited by: 2
Authors
Zhang, Jingtian [1]
Shou, Lidan [1]
Wu, Sai [1]
Chen, Gang [1]
Chen, Ke [1]
Affiliations
[1] Zhejiang Univ, Dept Comp Sci & Technol, Hangzhou 310027, Zhejiang, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
Frequent pattern mining; Subgroup discovery; Multi-dimensional dataset; Data mining; Anomaly detection; SUBGROUP DISCOVERY; FAST ALGORITHM; EFFICIENT; SD;
DOI
10.1016/j.eswa.2019.112946
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
A typical mining task is to retrieve all frequent patterns from a multi-dimensional dataset. Those patterns give us a basic idea of what the data look like and of their hidden inherent regularities. However, this is only useful for an unfamiliar dataset, while for datasets that are analyzed periodically, "unexpected" patterns are more interesting (e.g., some customers decided to subscribe to long-term deposits despite the burden of a housing loan). In this paper, we propose a new mining job, unexpected mining, which aims to retrieve frequent patterns that are not valid in a reference dataset but are significant enough in a specific subgroup. Given a reference dataset, we generate, step by step, all unexpected patterns for all subgroups. We extend existing mining approaches to support the new mining job efficiently. In particular, our scheme consists of an offline process and an online process. The offline process generates candidate patterns and builds an index table. The online process retrieves unexpected patterns for user-defined subgroups and a given support. Experiments on real datasets show that our approach can find interesting patterns and is very efficient compared to existing approaches. (C) 2019 Elsevier Ltd. All rights reserved.
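The offline/online split described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a pattern is a small set of attribute=value pairs, that the "index table" is a plain dictionary from pattern to row ids, and that a pattern counts as unexpected when its support reaches the threshold inside the subgroup but stays below it in the reference dataset. The names `build_index`, `unexpected_patterns`, and `max_len` are hypothetical.

```python
from itertools import combinations


def build_index(rows, max_len=2):
    """Offline phase (sketch): map each candidate pattern to the ids of rows containing it.

    Assumption: candidates are all attribute=value combinations of size <= max_len.
    """
    index = {}
    for rid, row in enumerate(rows):
        items = sorted(row.items())
        for k in range(1, max_len + 1):
            for pattern in combinations(items, k):
                index.setdefault(pattern, set()).add(rid)
    return index


def unexpected_patterns(index, n_rows, subgroup_ids, min_support):
    """Online phase (sketch): patterns frequent in the subgroup but not in the reference data."""
    subgroup_ids = set(subgroup_ids)
    if not subgroup_ids:
        return []
    result = []
    for pattern, rids in index.items():
        ref_support = len(rids) / n_rows
        sub_support = len(rids & subgroup_ids) / len(subgroup_ids)
        # Assumed criterion: above the threshold in the subgroup, below it in the reference.
        if sub_support >= min_support and ref_support < min_support:
            result.append((pattern, sub_support, ref_support))
    return sorted(result)


# Toy reference dataset (invented for illustration only).
rows = [
    {"housing": "yes", "deposit": "yes"},
    {"housing": "yes", "deposit": "yes"},
    {"housing": "no", "deposit": "no"},
    {"housing": "no", "deposit": "no"},
    {"housing": "no", "deposit": "no"},
    {"housing": "no", "deposit": "yes"},
]
index = build_index(rows)  # built once, offline
subgroup = [rid for rid, r in enumerate(rows) if r["housing"] == "yes"]
found = unexpected_patterns(index, len(rows), subgroup, min_support=0.6)
```

In this toy data, customers with a housing loan form the subgroup, and the pattern deposit=yes is frequent among them while falling below the threshold in the full dataset. Because the index is built once, each online query only intersects stored row-id sets with the chosen subgroup.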
Pages: 15