Comprehensive mining of frequent itemsets for a combination of certain and uncertain databases

Authors
Wazir S. [1]
Beg M.M.S. [2]
Ahmad T. [1]
Affiliations
[1] Department of Computer Engineering, Jamia Millia Islamia, New Delhi
[2] Department of Computer Engineering, Aligarh Muslim University, Aligarh
Keywords
Approximate Frequent Items; Certain and Uncertain Transactional Database; Expected Support; Frequent Itemset Mining; Normal Distribution; Poisson Distribution
DOI
10.1007/s41870-019-00310-0
Abstract
Frequent itemset mining can be performed with sequential algorithms such as Apriori on a standalone system, or with parallel algorithms such as Count Distribution on a distributed system. Because parallel algorithms incur communication overhead and candidate generation grows exponentially, many algorithms have been developed to compute frequent items over either a certain or an uncertain database. However, no algorithm developed so far generates frequent itemsets from a combination of both kinds of database. We previously proposed the MasterApriori algorithm, which computes approximate frequent items for a combination of certain and uncertain databases, using Apriori for the certain database and expected-support-based UApriori for the uncertain database. In this paper, we extend that work by using Poisson-distribution-based and Normal-distribution-based UApriori for the uncertain database. In the proposed algorithms, the sites where the data is distributed communicate only once, which reduces the communication overhead. The scalability and efficiency of the proposed algorithms are evaluated on standard and synthetic databases. Performance is measured by comparing the time taken and the number of frequent items generated by each algorithm. © 2019, Bharati Vidyapeeth's Institute of Computer Applications and Management.
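As a rough illustration of the quantities the abstract names, the sketch below computes the expected support of an itemset and the Poisson and Normal approximations of the probability that its support reaches a minimum count. It is not the authors' MasterApriori or UApriori implementation. The function names, the toy data, the independence of items and transactions, and the choice to model a certain transaction as one whose items all have probability 1.0 are assumptions made only for this example.

```python
import math

def itemset_probs(db, itemset):
    """Per-transaction probability that every item of `itemset` is present.

    Assumption: each transaction is a dict mapping item -> existential
    probability, and items occur independently within a transaction.
    """
    return [math.prod(t.get(item, 0.0) for item in itemset) for t in db]

def expected_support(db, itemset):
    """Expected support: sum of the per-transaction itemset probabilities."""
    return sum(itemset_probs(db, itemset))

def poisson_frequent_prob(db, itemset, min_count):
    """Approximate P(support >= min_count) with a Poisson(lambda) model,
    where lambda is the expected support."""
    lam = expected_support(db, itemset)
    cdf = sum(math.exp(-lam) * lam ** k / math.factorial(k)
              for k in range(min_count))
    return 1.0 - cdf

def normal_frequent_prob(db, itemset, min_count):
    """Approximate P(support >= min_count) with a Normal model using the
    expected support as mean and sum(p * (1 - p)) as variance, with a
    0.5 continuity correction."""
    probs = itemset_probs(db, itemset)
    mu = sum(probs)
    var = sum(p * (1.0 - p) for p in probs)
    if var == 0.0:
        # Support is deterministic (for example, a purely certain database).
        return 1.0 if mu >= min_count else 0.0
    z = (min_count - 0.5 - mu) / math.sqrt(var)
    return 0.5 * math.erfc(z / math.sqrt(2.0))  # equals 1 - Phi(z)

# Hypothetical toy data: two certain and two uncertain transactions.
certain_db = [{"a": 1.0, "b": 1.0}, {"a": 1.0}]
uncertain_db = [{"a": 0.5, "b": 0.8}, {"a": 0.9, "b": 0.4}]
combined_db = certain_db + uncertain_db

if __name__ == "__main__":
    for itemset in (("a",), ("a", "b")):
        print(itemset,
              expected_support(combined_db, itemset),
              poisson_frequent_prob(combined_db, itemset, 2),
              normal_frequent_prob(combined_db, itemset, 2))
```

An Apriori-style miner would call one of these functions for each candidate itemset and keep the candidates whose expected support, or approximated frequentness probability, clears a user-chosen threshold.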
Pages: 1205-1216
Number of pages: 11