ASSOCIATION-RULES-BASED DATA IMPUTATION WITH SPARK

Cited by: 0
Authors
Qu, Zhaowei [1]
Yan, Jianru [1]
Yin, Sixing [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing 100876, Peoples R China
Keywords
Association rules; Data preprocessing; Spark; Distributed algorithm
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Because of technical bottlenecks and errors introduced by manual operation, incomplete data is a persistent problem in big data research. Traditional data imputation algorithms incur high complexity, and their accuracy falls short of the desired level. At the same time, the analysis and computation involved in massive data make the limitations of traditional algorithms and computing platforms more noticeable. In this paper, we propose a data imputation method based on the Apriori algorithm and implement the corresponding algorithm on a distributed computing system built with Spark. The experimental results show that the proposed algorithm outperforms a traditional data imputation algorithm in terms of efficiency and accuracy.
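The abstract names the technique but gives no algorithmic detail, so the following is only a minimal single-machine sketch of the general idea of Apriori-based imputation, not the authors' Spark implementation. It mines frequent (attribute, value) itemsets level by level, derives single-consequent rules, and fills a missing attribute with the consequent of the highest-confidence rule whose antecedent matches the known values. The function names (`frequent_itemsets`, `mine_rules`, `impute`), the thresholds, and the toy records are all hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori search; returns {itemset: support}."""
    n = len(transactions)
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Keep only candidates that meet the support threshold.
        level = {}
        for cand in current:
            support = sum(1 for t in transactions if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-candidates.
        k += 1
        joined = {a | b for a, b in combinations(list(level), 2)
                  if len(a | b) == k}
        # Prune step: every k-subset of a candidate must be frequent.
        current = {c for c in joined
                   if all(frozenset(s) in level
                          for s in combinations(c, k - 1))}
    return frequent


def mine_rules(rows, min_support=0.3, min_confidence=0.7):
    """Return rules as (antecedent, (attr, value), confidence) tuples."""
    transactions = [frozenset(row.items()) for row in rows]
    frequent = frequent_itemsets(transactions, min_support)
    rules = []
    for itemset, support in frequent.items():
        if len(itemset) < 2:
            continue
        for consequent in itemset:
            antecedent = itemset - {consequent}
            # The antecedent is frequent by downward closure.
            confidence = support / frequent[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, confidence))
    return rules


def impute(row, rules):
    """Fill None values using the best matching rule, if any."""
    known = {(a, v) for a, v in row.items() if v is not None}
    filled = dict(row)
    for attr, value in row.items():
        if value is not None:
            continue
        matches = [(conf, cons[1]) for ante, cons, conf in rules
                   if cons[0] == attr and ante <= known]
        if matches:
            filled[attr] = max(matches, key=lambda m: m[0])[1]
    return filled


# Hypothetical complete records used to mine the rules.
complete_rows = [
    {"weather": "rain", "traffic": "heavy"},
    {"weather": "rain", "traffic": "heavy"},
    {"weather": "rain", "traffic": "heavy"},
    {"weather": "sun", "traffic": "light"},
    {"weather": "sun", "traffic": "light"},
]
rules = mine_rules(complete_rows)
print(impute({"weather": "rain", "traffic": None}, rules))
```

In a distributed setting, the support-counting loop is the part that would typically be parallelised across partitions; how the paper does this on Spark is not stated in the abstract.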
Pages: 145-149
Page count: 5