Sampling Based on Genetic Algorithm for Data Mining

Cited by: 0
Authors
Wang Jianyong [1]
Huang Yu [1]
Hu Bin [1]
Wei Xiaomei [1]
Affiliations
[1] Huazhong Agr Univ, Coll Sci, Wuhan 430070, Hubei Province, Peoples R China
Keywords
Genetic algorithm; Association rules; Accuracy
DOI
Not available
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
Collecting a large initial sample from a huge data set, and then distilling from it a smaller sample with the same accuracy, can greatly speed up data mining algorithms. Because the distilling process has been proved to be an NP-hard problem, the two-phase sampling algorithm FAST adopts a greedy method. This paper presents SGA, a sampling algorithm that applies a genetic algorithm to sample distilling; in the experiments, SGA performs better than popular sampling algorithms, including FAST.
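The abstract does not specify SGA's encoding, fitness function, or genetic operators, so the following is only a minimal sketch of what genetic-algorithm-based sample distilling could look like, not the authors' method. It assumes that a candidate is a fixed-size set of transaction indices from the initial sample and that fitness is the negative L1 distance between the item frequencies of the candidate and those of the initial sample. The names `item_frequencies`, `fitness`, and `distill`, and all parameter values, are hypothetical.

```python
import random


def item_frequencies(transactions, items):
    """Fraction of transactions that contain each item."""
    n = len(transactions)
    return [sum(1 for t in transactions if item in t) / n for item in items]


def fitness(subset_idx, initial, items, target):
    """Assumed fitness: negative L1 distance to the initial sample's frequencies."""
    subset = [initial[i] for i in subset_idx]
    freqs = item_frequencies(subset, items)
    return -sum(abs(f - g) for f, g in zip(freqs, target))


def distill(initial, size, generations=50, pop_size=30, mutation_rate=0.1, seed=0):
    """Search for `size` transaction indices whose item frequencies match `initial`."""
    rng = random.Random(seed)
    items = sorted({item for t in initial for item in t})
    target = item_frequencies(initial, items)
    n = len(initial)

    def score(individual):
        return fitness(individual, initial, items, target)

    # Each individual is a list of `size` distinct indices into `initial`.
    population = [rng.sample(range(n), size) for _ in range(pop_size)]
    for _ in range(generations):
        # Elitist selection: the better half survives unchanged.
        population.sort(key=score, reverse=True)
        elite = population[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Crossover: draw the child from the union of both parents.
            pool = list(set(a) | set(b))
            child = rng.sample(pool, size)
            # Mutation: swap one index for a transaction not yet in the child.
            if rng.random() < mutation_rate:
                outside = [i for i in range(n) if i not in child]
                if outside:
                    child[rng.randrange(size)] = rng.choice(outside)
            children.append(child)
        population = elite + children
    best = max(population, key=score)
    return sorted(best), -score(best)


# Toy initial sample: each transaction is a set of items (made-up data).
initial = [{"a", "b"}, {"a"}, {"b", "c"}, {"a", "c"},
           {"a", "b", "c"}, {"b"}, {"a", "b"}, {"c"}]
chosen, error = distill(initial, size=4)
print(chosen, error)
```

A greedy method such as the one the abstract attributes to FAST commits to one local choice at a time, whereas a population-based search of this kind keeps many candidate subsets alive at once. Whether that accounts for the reported improvement is not stated in the abstract.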
Pages: 3667-3672
Number of pages: 6