Similarity-based data reduction techniques

Authors
Guo, G [1]
Wang, H
Bell, D
Affiliations
[1] Univ Ulster, Sch Comp & Math, Coleraine BT37 0QB, Londonderry, North Ireland
[2] Univ Bradford, Dept Comp, Bradford BD7 1DP, W Yorkshire, England
[3] Queens Univ Belfast, Sch Comp Sci, Belfast BT7 1NN, Antrim, North Ireland
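Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract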
The k-nearest neighbours (kNN) method is a simple but effective method for classification. Its major drawbacks are (1) low efficiency, and (2) dependency on the selection of a "good value" for k. In this paper, we propose a novel similarity-based data reduction method (SBModel), together with three variants, aimed at overcoming these shortcomings. Our method constructs a similarity-based model of the data, which replaces the data as the basis of classification. The value of k is determined automatically, varies with the local data distribution, and is optimal in terms of classification accuracy. Constructing the model significantly reduces the amount of data needed for classification, thus making classification faster. Experiments conducted on several public data sets show that SBModel and its variants compare well with C5.0, kNN, wkNN, and other data reduction methods in both efficiency and effectiveness.
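
The abstract does not give the details of SBModel, so the following is only an illustrative sketch of the general idea it describes: replacing the training data with a small set of representatives, each covering a local neighbourhood whose size is set by the data rather than by a fixed k. The greedy covering rule, the tuple layout `(centre, radius, label, count)`, and the function names `build_model` and `classify` are assumptions made for this example, not the authors' algorithm.

```python
import math


def build_model(points, labels):
    """Greedily replace the data with representatives (assumed scheme).

    Each representative is a tuple (centre, radius, label, count), where
    count is the number of training points the neighbourhood covers.
    """
    n = len(points)
    uncovered = set(range(n))
    model = []
    while uncovered:
        best_centre, best_covered = None, set()
        for i in uncovered:
            # The neighbourhood of i may grow until it would reach the
            # nearest point of a different class.
            enemy = min(
                (math.dist(points[i], points[j])
                 for j in range(n) if labels[j] != labels[i]),
                default=math.inf,
            )
            covered = {
                j for j in uncovered
                if labels[j] == labels[i]
                and math.dist(points[i], points[j]) < enemy
            } | {i}  # a point always covers itself
            if len(covered) > len(best_covered):
                best_centre, best_covered = i, covered
        radius = max(math.dist(points[best_centre], points[j])
                     for j in best_covered)
        model.append((points[best_centre], radius,
                      labels[best_centre], len(best_covered)))
        uncovered -= best_covered
    return model


def classify(model, x):
    """Classify x from the representatives alone (assumed rule)."""
    # Prefer neighbourhoods that contain x; take the most populous one.
    hits = [rep for rep in model if math.dist(rep[0], x) <= rep[1]]
    if hits:
        return max(hits, key=lambda rep: rep[3])[2]
    # Otherwise use the representative whose boundary is closest to x.
    return min(model, key=lambda rep: math.dist(rep[0], x) - rep[1])[2]


# Toy data: two well-separated clusters of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
model = build_model(points, labels)
print(len(points), "points reduced to", len(model), "representatives")
print(classify(model, (0.5, 0.5)), classify(model, (10, 10)))
```

In this sketch, classification compares a query against the representatives only, which is where the speed-up over plain kNN would come from; how closely this matches SBModel's actual construction and its three variants cannot be determined from the abstract.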
Pages: 211-232
Number of pages: 22