Scalable inductive learning on partitioned data

Cited by: 0
Authors
Chen, QJ [1]
Wu, XD [1]
Zhu, XQ [1]
Affiliation
[1] Univ Vermont, Dept Comp Sci, Burlington, VT 05405 USA
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the rapid advancement of information technology, scalability has become a necessity for learning algorithms dealing with large, real-world data repositories. In this paper, scalability is accomplished through a data reduction technique, which partitions a large data set into subsets, applies a learning algorithm to each subset sequentially or concurrently, and then integrates the learned results. Five strategies to achieve scalability (Rule-Example Conversion, Rule Weighting, Iteration, Good Rule Selection, and Data Dependent Rule Selection) are identified, and seven corresponding scalable schemes are designed and developed. A substantial number of experiments have been performed to evaluate these schemes. Experimental results demonstrate that, through data reduction, some of our schemes can effectively build accurate classifiers from weak classifiers generated on data subsets. Furthermore, our schemes require significantly less training time than generating a global classifier.
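The partition-learn-integrate pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' rule-based schemes (Rule Weighting, Good Rule Selection, etc.); it is a hedged stand-in that uses a simple nearest-centroid weak learner per subset and unweighted majority voting as the integration step, on hypothetical toy data.

```python
# Minimal sketch (not the paper's exact schemes): partition a data set,
# train one weak classifier per subset, integrate by majority vote.
from collections import Counter
import random

def nearest_centroid_fit(subset):
    """Weak learner: per-class mean (centroid) of the feature vectors."""
    by_label = {}
    for x, y in subset:
        by_label.setdefault(y, []).append(x)
    return {y: tuple(sum(col) / len(xs) for col in zip(*xs))
            for y, xs in by_label.items()}

def nearest_centroid_predict(model, x):
    """Predict the class whose centroid is closest (squared Euclidean)."""
    return min(model, key=lambda y: sum((a - b) ** 2
                                        for a, b in zip(x, model[y])))

def partitioned_learn(data, k):
    """Shuffle, split into k subsets, and fit one weak classifier each."""
    random.shuffle(data)
    return [nearest_centroid_fit(data[i::k]) for i in range(k)]

def vote(models, x):
    """Integration step: unweighted majority vote over the subset models."""
    return Counter(nearest_centroid_predict(m, x)
                   for m in models).most_common(1)[0][0]

# Hypothetical toy data: class 0 near (0, 0), class 1 near (5, 5).
random.seed(0)
data = ([((random.gauss(0, 1), random.gauss(0, 1)), 0) for _ in range(60)] +
        [((random.gauss(5, 1), random.gauss(5, 1)), 1) for _ in range(60)])
models = partitioned_learn(data, k=4)
print(vote(models, (0.2, -0.1)), vote(models, (5.1, 4.8)))
```

Because each subset model is trained independently, the k fits can run sequentially or concurrently, which is where the scalability in the abstract comes from; the paper's schemes replace the voting step with rule-level selection and weighting.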
Pages: 391-403 (13 pages)
Related papers (50 total)
  • [1] Scalable Privacy-Preserving Data Mining with Asynchronously Partitioned Datasets
    Kikuchi, Hiroaki
    Kagawa, Daisuke
    Basu, Anirban
    Ishii, Kazuhiko
    Terada, Masayuki
    Honga, Sadayuki
    FUTURE CHALLENGES IN SECURITY AND PRIVACY FOR ACADEMIA AND INDUSTRY, 2011, 354 : 223 - +
  • [2] Scalable Privacy-Preserving Data Mining with Asynchronously Partitioned Datasets
    Kikuchi, Hiroaki
    Kagawa, Daisuke
    Basu, Anirban
    Ishii, Kazuhiko
    Terada, Masayuki
    Hongo, Sadayuki
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2013, E96A (01) : 111 - 120
  • [3] Deep Learning Using Partitioned Data Vectors
    Mitchell, Ben
    Tosun, Hasari
    Sheppard, John
    2015 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2015,
  • [4] A fault-tolerant and scalable boosting method over vertically partitioned data
    Jiang, Hai
    Shang, Songtao
    Liu, Peng
    Yi, Tong
    CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY, 2024, 9 (05) : 1092 - 1100
  • [5] Communication Efficient Distributed Learning with Feature Partitioned Data
    Zhang, Bingwen
    Geng, Jun
    Xu, Weiyu
    Lai, Lifeng
    2018 52ND ANNUAL CONFERENCE ON INFORMATION SCIENCES AND SYSTEMS (CISS), 2018,
  • [6] Partitioned learning of deep Boltzmann machines for SNP data
    Hess, Moritz
    Lenz, Stefan
    Blaette, Tamara J.
    Bullinger, Lars
    Binder, Harald
    BIOINFORMATICS, 2017, 33 (20) : 3173 - 3180
  • [7] FOLD-R++: A Scalable Toolset for Automated Inductive Learning of Default Theories from Mixed Data
    Wang, Huaduo
    Gupta, Gopal
    FUNCTIONAL AND LOGIC PROGRAMMING, FLOPS 2022, 2022, 13215 : 224 - 242
  • [8] FOLD-RM: A Scalable, Efficient, and Explainable Inductive Learning Algorithm for Multi-Category Classification of Mixed Data
    Wang, Huaduo
    Shakerin, Farhad
    Gupta, Gopal
    THEORY AND PRACTICE OF LOGIC PROGRAMMING, 2022, 22 (05) : 658 - 677
  • [9] Overcoming graph topology imbalance for inductive and scalable semi-supervised learning
    Dornaika, F.
    Ibrahim, Z.
    Bosaghzadeh, A.
    APPLIED SOFT COMPUTING, 2024, 151
  • [10] MULTI-TIER FEDERATED LEARNING FOR VERTICALLY PARTITIONED DATA
    Das, Anirban
    Patterson, Stacy
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 3100 - 3104